fix(llm): adaptive context budget for outline phase to avoid timeouts by therealbrad · Pull Request #336 · TestPlanIt/testplanit

therealbrad · 2026-05-22T22:09:26Z

Description

Follow-up to PR #335. The fixed 1500-token outline context budget shipped in #335 traded one problem for another:

1. Budget math was over-charging. fetchHierarchyContext bills the budget per case including all steps and field-value text. The outline prompt only renders titles. Net effect: 1500 tokens fit only ~7–10 cases instead of the ~100 titles the LLM could actually consume.
2. Large folders still risked timing out. A fixed budget can't adapt to a slow integration or a model that's having a bad minute.
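
To make the over-charging concrete, here is a minimal sketch that compares the two ways of billing a case against the budget. Everything in it is an assumption for illustration: the `CaseRecord` shape, the roughly-4-characters-per-token heuristic, and the helper names `costFull`, `costNames`, and `casesThatFit` do not come from the PR, and the project's real estimator may differ.

```typescript
// Hypothetical case shape; the real records are assumed to carry more fields.
interface CaseRecord {
  name: string;
  steps: string[];
  fieldValues: string[];
}

// Rough heuristic of ~4 characters per token (an assumption, not the project's estimator).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Full billing: charge for the name plus every step and field value.
function costFull(c: CaseRecord): number {
  return estimateTokens([c.name, ...c.steps, ...c.fieldValues].join("\n"));
}

// Name-only billing: charge for the title, which is all the outline prompt renders.
function costNames(c: CaseRecord): number {
  return estimateTokens(c.name);
}

// Count how many cases fit, in order, before the budget would be exceeded.
function casesThatFit(
  cases: CaseRecord[],
  budget: number,
  cost: (c: CaseRecord) => number,
): number {
  let used = 0;
  let count = 0;
  for (const c of cases) {
    const next = cost(c);
    if (used + next > budget) break;
    used += next;
    count += 1;
  }
  return count;
}
```

With cases that carry several long steps, `casesThatFit(cases, 1500, costFull)` lands in the single digits while `casesThatFit(cases, 1500, costNames)` admits an order of magnitude more, which is the gap described above.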

This PR makes the outline phase tolerate both, and patterns it after the existing duplicate-detection / auto-tag split-and-retry logic so the codebase keeps one shape for this kind of problem.

Two changes:

1. fetchHierarchyContext gains a mode: "names" | "full" parameter.
   - "names" mode skips loading steps and field values from the DB and bills the budget by name length only: ~100 titles in 1500 tokens instead of ~7–10 cases.
   - Default stays "full" so the single-shot and stream generators don't change behavior.
2. The outline route runs the LLM with an adaptive context budget, modeled on duplicate-analysis.service.ts:149-224:
   - Start at 1500 tokens for a fresh integration.
   - On a clean success → grow the learned budget by 1.5× for the next call, capped at 8000 (~600 titles).
   - On a timeout → halve the budget and retry the same request, up to depth 3. Save the smaller working size so the next call starts there.
   - If the halve-chain bottoms out below 100 tokens → final attempt with no existing-cases context. Matches pre-#335 behavior, so the call always returns something.
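
The halve-and-retry behavior listed above can be sketched roughly as follows. This is not the route's actual code: `halvingChain`, `runWithHalving`, the constant names, and the use of `Math.floor` when halving are all assumptions. It also assumes the no-context fallback fires both when the next budget would drop below 100 tokens and when the retry depth is exhausted.

```typescript
// Illustrative constants; the values come from the description, the names are made up.
const MIN_BUDGET = 100; // below this, drop existing-cases context entirely
const MAX_RETRY_DEPTH = 3; // at most three halvings after the first attempt

// Budgets to try in order, always ending with 0 (no existing-cases context).
function halvingChain(start: number): number[] {
  const chain: number[] = [];
  let budget = start;
  for (let depth = 0; depth <= MAX_RETRY_DEPTH && budget >= MIN_BUDGET; depth++) {
    chain.push(budget);
    budget = Math.floor(budget / 2); // rounding direction is an assumption
  }
  chain.push(0);
  return chain;
}

// Walk the chain; only timeouts trigger a retry with a smaller budget.
async function runWithHalving<T>(
  start: number,
  attempt: (budget: number) => Promise<T>,
  isTimeout: (err: unknown) => boolean,
): Promise<{ result: T; budgetUsed: number }> {
  const chain = halvingChain(start);
  for (let i = 0; i < chain.length; i++) {
    const budget = chain[i];
    try {
      return { result: await attempt(budget), budgetUsed: budget };
    } catch (err) {
      const isLast = i === chain.length - 1;
      // Non-timeout errors, and a timeout on the final attempt, propagate to the caller.
      if (isLast || !isTimeout(err)) throw err;
    }
  }
  throw new Error("unreachable: the chain always has at least one entry");
}
```

Under these assumptions a fresh integration tries 1500, 750, 375, and 187 tokens before the final no-context attempt, and a caller would store `budgetUsed` as the learned size for the next call.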

State is held in memory only and is lost on restart. Test case generation has never persisted state across restarts.

Convergence example (fast integration, 6 successful calls in a session): 1500 → 2250 → 3375 → 5063 → 7595 → 8000 (cap) → 8000
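
The sequence above can be reproduced with a small sketch of the growth side of the helpers. `getStartingBudget` is named in the Testing section, but its signature here is a guess; `recordSuccess`, `simulateSuccesses`, the constant names, and the choice of `Math.round` are assumptions made for illustration. The sketch covers growth and the cap only, not the shrink-on-timeout path.

```typescript
const INITIAL_BUDGET = 1500;
const MAX_BUDGET = 8000;
const GROWTH_FACTOR = 1.5;

// In-memory only: one learned budget per integration, lost on restart.
const learned = new Map<string, number>();

// Unknown or non-positive learned values fall back to the initial budget.
function getStartingBudget(integrationId: string): number {
  const value = learned.get(integrationId);
  return value !== undefined && value > 0 ? value : INITIAL_BUDGET;
}

// After a clean success, grow the learned budget by 1.5x, capped at MAX_BUDGET.
// Math.round is an assumption; it matches the rounded values in the sequence.
function recordSuccess(integrationId: string, budgetUsed: number): void {
  const base = budgetUsed > 0 ? budgetUsed : INITIAL_BUDGET;
  learned.set(integrationId, Math.min(MAX_BUDGET, Math.round(base * GROWTH_FACTOR)));
}

// Starting budget of each call, plus the budget the next call would start from.
function simulateSuccesses(integrationId: string, calls: number): number[] {
  const seen: number[] = [];
  for (let i = 0; i < calls; i++) {
    const budget = getStartingBudget(integrationId);
    seen.push(budget);
    recordSuccess(integrationId, budget);
  }
  seen.push(getStartingBudget(integrationId));
  return seen;
}
```

Calling `simulateSuccesses("demo", 6)` yields 1500, 2250, 3375, 5063, 7595, 8000, 8000, and a different integration id still starts at 1500 because each id has its own map entry.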

Worst case (doomed call on a very slow integration): four timeouts in total (the initial attempt plus three halved retries) before falling back to no context.

Related Issue

N/A — direct follow-up to a customer report addressed in PR #335.

Type of Change

Bug fix (non-breaking change which fixes an issue)
Performance / reliability

Testing

10 new unit tests for the adaptive-budget helpers in outline/adaptive-budget.test.ts:
- getStartingBudget initial, growth, cap, never-below-initial, learned-zero recovery, per-integration isolation, and the full convergence sequence (1500 → 2250 → 3375 → 5063 → 7595 → 8000)
- isTimeoutError recognises both the LlmError { code: "TIMEOUT" } shape and plain Error("Request timeout") messages
Full unit suite: 7429 passed / 0 failed (+10 vs main).
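
For reference, a timeout classifier that accepts both error shapes mentioned above might look like the sketch below. The project's real `isTimeoutError` and `LlmError` are not shown in this PR text, so the structural check on `code` and the case-insensitive match on the message are assumptions.

```typescript
// Sketch only: treats any object with code "TIMEOUT" as an LlmError-style timeout,
// and any Error whose message mentions "timeout" as a plain timeout.
function isTimeoutError(err: unknown): boolean {
  if (typeof err === "object" && err !== null && "code" in err) {
    if ((err as { code?: unknown }).code === "TIMEOUT") return true;
  }
  if (err instanceof Error) {
    return /timeout/i.test(err.message);
  }
  return false;
}
```

Taking `unknown` and narrowing inside the function keeps the check safe for whatever value a `catch` clause receives, including non-Error values.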

Checklist

My code follows the project style guidelines
I have performed a self-review of my own code
My changes generate no new warnings
I have added tests that prove my fix is effective
New and existing unit tests pass locally with my changes

The fixed 1500-token budget set in PR #335 traded one problem for another: small folders worked but large ones still risked timing out, and the budget math was charging for full case content (steps + field values) while the outline prompt only renders names. With the full-content estimator, 1500 tokens fits ~7-10 cases, far fewer than the LLM could actually consume.

Two changes:

1. fetchHierarchyContext gains a `mode: "names" | "full"` parameter. In "names" mode it skips loading steps and field values from the DB and bills the budget per name length only, so 1500 tokens fits ~100 case titles instead of ~7-10. Default stays "full" so the single-shot and stream generators don't change behavior.
2. The outline route runs the LLM with an adaptive context budget, modeled on the duplicate-detection split-and-retry pattern:
   - Start at 1500 tokens for a fresh integration.
   - On a clean success, grow the learned budget by 1.5x for the next call, capped at 8000 (~600 titles).
   - On a timeout, halve the budget and retry the same request, up to depth 3. Remember the smaller working size so the next call starts from there (then grows back up over successive successful calls).
   - If the halve-chain bottoms out below 100 tokens, the final attempt runs with no existing-cases context, matching pre-PR-#335 behavior, so the call always returns something.

State is in-memory only, lost on restart. Test case generation has never persisted across restarts.

Extracted budget helpers to outline/adaptive-budget.ts with 10 unit tests covering growth, cap, never-below-initial, per-integration isolation, convergence sequence, and isTimeoutError classification.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

therealbrad wants to merge 1 commit into main from fix/outline-adaptive-context-budget
