fix(llm): honor configured max-output-tokens in outline endpoint #334
Merged
The two-phase test-case generation outline route silently capped maxTokens at 1024 via Math.min(providerConfig.defaultMaxTokens ?? 1024, 1024), overriding whatever value the user set in LLM settings. When a user requested "Many" outlines, the response often exceeded 1024 tokens, causing the JSON to truncate mid-array. The resulting parse error suggested raising max-output-tokens, a setting the route ignored.

- Drop the hard 1024 ceiling; honor providerConfig.defaultMaxTokens and fall back to resolvedPrompt.maxOutputTokens, matching the expand route's resolution order.
- Bump the no-config fallback from 1024 to 2048 so "Many" outlines fit out of the box.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🎉 This PR is included in version 0.29.8 🎉
The release is available on GitHub release.
Your semantic-release bot 📦🚀
Description
The two-phase test-case generation outline route silently capped maxTokens at 1024 via Math.min(providerConfig.defaultMaxTokens ?? 1024, 1024).

That Math.min(..., 1024) overrode whatever value the user set in LLM settings. When a user requested "Many" outlines, the response often exceeded 1024 tokens, causing the JSON to truncate mid-array. The resulting parse error (improved in #327) suggested "Raise the model's max-output-tokens in LLM settings", a setting the route ignored.

This PR:
- Drops the hard 1024 ceiling; honors providerConfig.defaultMaxTokens and falls back to resolvedPrompt.maxOutputTokens, matching the resolution order already used by the /expand route.
- Bumps the no-config fallback from 1024 to 2048 so "Many" outlines fit out of the box.

The cap has been there since the two-phase split in #304; #327 only made the symptom user-visible.
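For reviewers, the change in resolution order can be sketched roughly as follows. This is an illustration, not the route's actual code: the ProviderConfig and ResolvedPrompt interfaces and both function names are hypothetical, and placing 2048 as the last fallback after resolvedPrompt.maxOutputTokens is an assumption. Only the Math.min expression and the 1024 and 2048 values come from this PR.

```typescript
// Hypothetical shapes for illustration; the real types in the codebase may differ.
interface ProviderConfig {
  defaultMaxTokens?: number;
}
interface ResolvedPrompt {
  maxOutputTokens?: number;
}

// Before: the outer Math.min pins the result at 1024 whatever the user configured.
function resolveMaxTokensBefore(providerConfig: ProviderConfig): number {
  return Math.min(providerConfig.defaultMaxTokens ?? 1024, 1024);
}

// After (sketch): the user's setting wins, then the prompt's own value,
// then the no-config fallback of 2048 (the position of this fallback is assumed).
function resolveMaxTokensAfter(
  providerConfig: ProviderConfig,
  resolvedPrompt: ResolvedPrompt,
): number {
  return providerConfig.defaultMaxTokens ?? resolvedPrompt.maxOutputTokens ?? 2048;
}

console.log(resolveMaxTokensBefore({ defaultMaxTokens: 4096 })); // 1024
console.log(resolveMaxTokensAfter({ defaultMaxTokens: 4096 }, {})); // 4096
console.log(resolveMaxTokensAfter({}, { maxOutputTokens: 3000 })); // 3000
console.log(resolveMaxTokensAfter({}, {})); // 2048
```

The first call shows why raising the setting had no effect: any configured value above 1024 was clamped back down to 1024.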
Related Issue
N/A — surfaced via user report after #327 added parse-failure diagnostics.
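The failure mode behind that report can be illustrated with a small sketch. It is not the diagnostics code from #327: the outline data is made up, tryParse is a hypothetical helper, and cutting the string by character count only approximates a provider stopping at its token limit.

```typescript
// Made-up outline data, serialized as a JSON array like the route expects.
const outlines = Array.from({ length: 20 }, (_, i) => ({
  title: `Outline ${i + 1}`,
  summary: "Checks one behaviour of the feature under test.",
}));
const full = JSON.stringify(outlines);

// Simulate a length-limited response by cutting the text mid-array.
const truncated = full.slice(0, Math.floor(full.length / 2));

// Hypothetical helper: report whether the text parses, with the error if not.
function tryParse(text: string): { ok: boolean; error?: string } {
  try {
    JSON.parse(text);
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

console.log(tryParse(full).ok); // true
console.log(tryParse(truncated).ok); // false
```

Because the closing bracket of the array never arrives, the truncated text cannot parse, whatever the content of the individual outlines.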
Type of Change
How Has This Been Tested?
Test Configuration:
Reproduced the truncation locally with "Many" outlines hitting finishReason: length at ~4400 chars; after the fix, the defaultMaxTokens value configured in LLM settings is honored and the response completes.
Checklist
Additional Notes
Heads up: main currently has unrelated TypeScript errors in ImportCasesWizard.tsx introduced by #332. Those are not from this change and will need a separate fix for CI to go green here.