fix(llm): pass existing folder cases into outline prompt to avoid duplicates by therealbrad · Pull Request #335 · TestPlanIt/testplanit

therealbrad · 2026-05-22T19:33:38Z

Description

When a user generated test cases for the same story a second time (for example, first run focused on critical paths, second run on edge cases via the suggestion chips), the AI would produce overlapping titles because the outline phase had no awareness of cases that already existed.

The single-shot generator (route.ts) has always fetched folder-hierarchy context and passed it into the prompt with an explicit "do not duplicate" instruction. The newer two-phase outline → expand flow dropped that context — buildOutlineUserPrompt only saw the issue plus the free-form user notes, so a second run on the same story produced near-duplicates.

This restores parity. No UI, behavior, or API surface changes — same inputs, same flow; the outline LLM just now sees the same coverage context the single-shot generator always did.

Changes:

buildOutlineUserPrompt renders an EXISTING TEST CASES — DO NOT DUPLICATE OR SUBSTANTIALLY OVERLAP block when the context carries cases. Each entry is the title plus up to 200 chars of description (the outline only needs enough signal to recognise overlap; full step detail is unnecessary).
outline/route.ts now mirrors the single-shot route's token-budget math and calls fetchHierarchyContext before assembling the user prompt, then re-builds the prompt with the enriched context.
Four new unit tests for buildOutlineUserPrompt covering the no-existing-cases path, the rendered block, 200-char truncation, and no-description entries.
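
A minimal sketch of the rendering described above, to make the title-plus-200-chars rule concrete. The `ExistingCase` and `OutlineContext` shapes, the `ISSUE:`/`NOTES:` labels, the bullet layout, and the function names are assumptions for illustration, and the heading is shortened; the actual `buildOutlineUserPrompt` in the repository may differ.

```typescript
// Hypothetical shapes; the real context type in the repository may differ.
interface ExistingCase {
  title: string;
  description?: string | null;
}

interface OutlineContext {
  issueTitle: string;
  userNotes?: string;
  existingCases?: ExistingCase[];
}

// From the PR description: at most 200 chars of description per entry.
const MAX_DESCRIPTION_CHARS = 200;

// Renders the existing-cases block, or "" when there is nothing to show.
function renderExistingCasesBlock(cases: ExistingCase[]): string {
  if (cases.length === 0) return "";
  const lines = cases.map((c) => {
    const desc = (c.description ?? "").trim();
    // Entries without a description render as the title alone.
    if (desc === "") return `- ${c.title}`;
    return `- ${c.title}: ${desc.slice(0, MAX_DESCRIPTION_CHARS)}`;
  });
  // Heading shortened for the sketch; the PR uses a longer "do not duplicate" heading.
  return ["EXISTING TEST CASES", ...lines].join("\n");
}

// Assembles the user prompt; the block is appended only when cases exist.
function buildOutlineUserPromptSketch(ctx: OutlineContext): string {
  const parts = [`ISSUE: ${ctx.issueTitle}`];
  if (ctx.userNotes) parts.push(`NOTES: ${ctx.userNotes}`);
  const block = renderExistingCasesBlock(ctx.existingCases ?? []);
  if (block !== "") parts.push(block);
  return parts.join("\n\n");
}
```

With no `existingCases`, the sketch returns the same prompt as before the change, which corresponds to the "no-existing-cases path" listed in the tests.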

Related Issue

N/A — customer support report.

Type of Change

Bug fix (non-breaking change which fixes an issue)

Testing

4 new unit tests for buildOutlineUserPrompt.
Full unit suite: 7420 passed / 0 failed.

Checklist

My code follows the project style guidelines
I have performed a self-review of my own code
My changes generate no new warnings
I have added tests that prove my fix is effective
New and existing unit tests pass locally with my changes

…licates

When a user generated test cases for the same story a second time (e.g. first run focused on critical paths, second run on edge cases via the notes suggestions), the AI would produce overlapping titles because the outline phase had no awareness of cases that already existed.

The single-shot generator (route.ts) has always fetched folder-hierarchy context and passed it into the prompt with an explicit "do not duplicate" instruction. The newer two-phase outline -> expand flow dropped that context. buildOutlineUserPrompt only saw the issue plus the free-form user notes, so a second run on the same story produced near-duplicates.

Restore parity:

- buildOutlineUserPrompt renders an "EXISTING TEST CASES" block when the context carries cases. Each entry is the title plus up to 200 chars of description (the outline only needs enough signal to recognise overlap; full step detail is unnecessary).
- outline/route.ts now mirrors the single-shot route's token-budget math and calls fetchHierarchyContext before assembling the user prompt, then re-builds the prompt with the enriched context.
- Four new unit tests for buildOutlineUserPrompt covering the no-existing-cases path, the rendered block, 200-char truncation, and no-description entries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The first pass at this PR copied the single-shot generator's 65%-of-request budget. That budget makes sense for the single-shot path (which generates entire cases in one round-trip), but it's too generous for the outline phase: at default 4096 tokens-per-request, the outline call could end up carrying ~2500 tokens of folder context plus all the steps and field values for every fetched case. Customer hit a 60s Anthropic timeout on every regeneration against a folder that already had cases.

The outline LLM only needs to know WHICH titles already exist to avoid emitting overlapping ones; descriptions and steps add no useful dedup signal at the title stage.

- Outline route: hardcoded 1500-token context budget instead of scaling with maxTokensPerRequest. Fits roughly 100-200 case names without bloating the prompt or pushing the request past the configured integration timeout (default 30s).
- buildOutlineUserPrompt now renders titles only, no descriptions. Heading renamed to "EXISTING TEST CASE TITLES" to reflect the shape, and a comment explains why descriptions are omitted.
- Updated existing-cases tests: replaced the 200-char-truncation and description-rendering tests with a single "descriptions and steps never appear" assertion that locks the contract.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
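
A rough sketch of the two points in this commit: the budget arithmetic and the titles-only block. The `CaseLike` shape, the bullet layout, and all function and constant names are assumptions for illustration; only the 65% share, the 4096 default, the 1500-token figure, and the heading text come from the commit message.

```typescript
// Hypothetical case shape; only `title` is ever rendered in the outline prompt.
interface CaseLike {
  title: string;
  description?: string;
  steps?: string[];
}

// Fixed outline context budget from the commit message (no scaling).
const OUTLINE_CONTEXT_TOKEN_BUDGET = 1500;

// What the first pass did: 65% of the per-request token limit.
function singleShotStyleBudget(maxTokensPerRequest: number): number {
  return Math.floor(maxTokensPerRequest * 0.65);
}

// Titles only: descriptions and steps are deliberately never rendered,
// since the outline stage only needs to recognise overlapping titles.
function renderExistingTitles(cases: CaseLike[]): string {
  if (cases.length === 0) return "";
  return ["EXISTING TEST CASE TITLES", ...cases.map((c) => `- ${c.title}`)].join("\n");
}
```

At the default of 4096 tokens per request, `singleShotStyleBudget(4096)` evaluates to 2662, in line with the "~2500 tokens" figure above and well over the fixed `OUTLINE_CONTEXT_TOKEN_BUDGET`.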

therealbrad · 2026-05-22T21:02:31Z

🎉 This PR is included in version 0.29.9 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

…#336)

The fixed 1500-token budget set in PR #335 traded one problem for another: small folders worked but large ones still risked timing out, and the budget math was charging for full case content (steps + field values) while the outline prompt only renders names. With the full-content estimator, 1500 tokens fits ~7-10 cases, far fewer than the LLM could actually consume.

Two changes:

1. fetchHierarchyContext gains a `mode: "names" | "full"` parameter. In "names" mode it skips loading steps and field values from the DB and bills the budget per name length only, so 1500 tokens fits ~100 case titles instead of ~7-10. Default stays "full" so the single-shot and stream generators don't change behavior.
2. The outline route runs the LLM with an adaptive context budget, modeled on the duplicate-detection split-and-retry pattern:
   - Start at 1500 tokens for a fresh integration.
   - On a clean success, grow the learned budget by 1.5x for the next call, capped at 8000 (~600 titles).
   - On a timeout, halve the budget and retry the same request, up to depth 3. Remember the smaller working size so the next call starts from there (then grows back up over successive successful calls).
   - If the halve-chain bottoms out below 100 tokens, the final attempt runs with no existing-cases context, matching pre-PR-#335 behavior, so the call always returns something.

State is in-memory only, lost on restart. Test case generation has never persisted across restarts.

Extracted budget helpers to outline/adaptive-budget.ts with 10 unit tests covering growth, cap, never-below-initial, per-integration isolation, convergence sequence, and isTimeoutError classification.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
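
The grow/halve rules above can be sketched roughly as follows. This is not the contents of outline/adaptive-budget.ts: every name here is hypothetical, the timeout check is a guess at what an `isTimeoutError`-style classifier might look at, and remembering the 100-token floor after a no-context success is an assumption. The "never-below-initial" rule mentioned in the test list is left out because its exact semantics are not stated.

```typescript
const INITIAL_BUDGET = 1500; // starting point for a fresh integration
const MAX_BUDGET = 8000;     // growth cap
const MIN_BUDGET = 100;      // below this, run with no existing-cases context
const GROWTH_FACTOR = 1.5;
const MAX_RETRY_DEPTH = 3;

// In-memory only: learned budgets are lost on restart.
const learnedBudgets = new Map<string, number>();

function growBudget(budget: number): number {
  return Math.min(MAX_BUDGET, Math.floor(budget * GROWTH_FACTOR));
}

// Halve; a result under the floor means "send no context" (budget 0).
function shrinkBudget(budget: number): number {
  const halved = Math.floor(budget / 2);
  return halved < MIN_BUDGET ? 0 : halved;
}

// Assumed classifier: treats abort errors and "timeout"-style messages as timeouts.
function isTimeoutErrorSketch(err: unknown): boolean {
  return (
    err instanceof Error &&
    (err.name === "AbortError" || /timeout|timed out/i.test(err.message))
  );
}

// `call` stands in for "fetch context within this budget, then run the outline LLM".
async function runOutlineWithAdaptiveBudget<T>(
  integrationId: string,
  call: (contextBudget: number) => Promise<T>,
): Promise<T> {
  let budget = learnedBudgets.get(integrationId) ?? INITIAL_BUDGET;
  for (let depth = 0; ; depth++) {
    try {
      const result = await call(budget);
      // Clean success grows the budget; success after retries keeps the
      // smaller working size (floored at MIN_BUDGET, an assumption).
      const remembered = depth === 0 ? growBudget(budget) : Math.max(budget, MIN_BUDGET);
      learnedBudgets.set(integrationId, remembered);
      return result;
    } catch (err) {
      // Give up on non-timeouts, at max depth, or if even no context timed out.
      if (!isTimeoutErrorSketch(err) || depth >= MAX_RETRY_DEPTH || budget === 0) {
        throw err;
      }
      budget = shrinkBudget(budget);
    }
  }
}
```

Keying the map by integration keeps a slow provider from dragging down the budget used for a fast one, which is the "per-integration isolation" the tests refer to.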

therealbrad and others added 2 commits May 22, 2026 14:33

therealbrad merged commit 4b0c3a3 into main May 22, 2026
5 checks passed

therealbrad deleted the fix/outline-existing-cases-context branch May 22, 2026 20:59

therealbrad added the released label May 22, 2026

therealbrad mentioned this pull request May 22, 2026

fix(llm): adaptive context budget for outline phase to avoid timeouts #336

Merged

7 tasks
