fix(llm): honor configured max-output-tokens in outline endpoint #334

Merged: therealbrad merged 1 commit into main from fix/outline-tokens-cap on May 22, 2026

@therealbrad
Contributor

Description

The two-phase test-case generation outline route silently capped maxTokens at 1024:

let maxTokens = resolvedPrompt.maxOutputTokens ?? 1024;
const providerConfig = await (prisma as any).llmProviderConfig.findFirst({ ... });
if (providerConfig) {
  maxTokens = Math.min(providerConfig.defaultMaxTokens ?? 1024, 1024);
}

The Math.min(..., 1024) overrode whatever value the user set in LLM settings. When a user requested "Many" outlines, the response often exceeded 1024 tokens, causing the JSON to truncate mid-array. The resulting parse error (improved in #327) suggested "Raise the model's max-output-tokens in LLM settings" — a setting the route ignored.
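To see why the configured value never took effect, the clamp can be isolated in a small function. This is a minimal sketch: `legacyOutlineMaxTokens` is a hypothetical name used only for illustration, and it reproduces just the `Math.min` expression quoted above, not the route itself.

```typescript
// Hypothetical helper (not from the codebase) that isolates the old clamp.
// The `?? 1024` supplies a default, and Math.min then caps the result at 1024,
// so no configured value above 1024 can ever come out.
function legacyOutlineMaxTokens(defaultMaxTokens?: number | null): number {
  return Math.min(defaultMaxTokens ?? 1024, 1024);
}

console.log(legacyOutlineMaxTokens(8192)); // 1024: the user's setting is discarded
console.log(legacyOutlineMaxTokens(null)); // 1024: default applies
console.log(legacyOutlineMaxTokens(512));  // 512: only smaller values survive
```

Values below 1024 passed through unchanged, which is likely why the cap went unnoticed until larger "Many" responses hit it.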

This PR:

  • Drops the hard 1024 ceiling; honors providerConfig.defaultMaxTokens and falls back to resolvedPrompt.maxOutputTokens, matching the resolution order already used by the /expand route.
  • Bumps the no-config fallback from 1024 → 2048 so "Many" outlines fit out of the box. Outlines are titles + summaries only, much smaller than the expand route's 4096 default.
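The resolution order described in the two bullets above can be sketched as follows. The merged diff is not shown on this page, so this is one reading of the description rather than the actual route code: `resolveOutlineMaxTokens`, the two `*Like` interfaces, and the constant name are all assumptions made for illustration.

```typescript
// Assumed shapes: only the fields named in the PR description are modeled.
interface ProviderConfigLike {
  defaultMaxTokens?: number | null;
}
interface ResolvedPromptLike {
  maxOutputTokens?: number | null;
}

// New no-config fallback stated in the PR description (previously 1024).
const OUTLINE_FALLBACK_MAX_TOKENS = 2048;

// Hypothetical helper: provider setting first, then the prompt's own limit,
// then the fallback. There is no upper clamp.
function resolveOutlineMaxTokens(
  providerConfig: ProviderConfigLike | null,
  resolvedPrompt: ResolvedPromptLike,
): number {
  return (
    providerConfig?.defaultMaxTokens ??
    resolvedPrompt.maxOutputTokens ??
    OUTLINE_FALLBACK_MAX_TOKENS
  );
}

console.log(resolveOutlineMaxTokens({ defaultMaxTokens: 4096 }, { maxOutputTokens: 1024 })); // 4096
console.log(resolveOutlineMaxTokens(null, { maxOutputTokens: 1500 })); // 1500
console.log(resolveOutlineMaxTokens(null, {})); // 2048
```

Using `??` rather than `||` means only `null` and `undefined` fall through to the next source.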

The cap has been there since the two-phase split in #304; #327 only made the symptom user-visible.

Related Issue

N/A — surfaced via user report after #327 added parse-failure diagnostics.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement

How Has This Been Tested?

  • Unit tests
  • Integration tests
  • E2E tests
  • Manual testing

Test Configuration:

  • OS: macOS (darwin 25.4.0)
  • Browser (if applicable): N/A (server-side route)
  • Node version: project pinned

Reproduced the truncation locally with "Many" outlines hitting finishReason: length at ~4400 chars; after the fix, defaultMaxTokens configured in LLM settings is honored and the response completes.
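The failure mode in the reproduction can be told apart from ordinary malformed JSON by checking the finish reason before parsing. The sketch below is illustrative only: `LlmResultLike`, `OutlineParse`, and `parseOutlineResult` are hypothetical names, and the `{ text, finishReason }` shape is an assumption rather than the project's actual response type.

```typescript
// Assumed response shape; the real provider client may differ.
interface LlmResultLike {
  text: string;
  finishReason: string;
}

type OutlineParse =
  | { ok: true; outlines: unknown[] }
  | { ok: false; reason: "truncated" | "invalid-json" | "not-an-array" };

// Hypothetical helper: report truncation explicitly instead of surfacing
// it as a generic JSON parse error.
function parseOutlineResult(result: LlmResultLike): OutlineParse {
  // "length" means the model stopped because it ran out of output tokens.
  if (result.finishReason === "length") {
    return { ok: false, reason: "truncated" };
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(result.text);
  } catch {
    return { ok: false, reason: "invalid-json" };
  }
  if (!Array.isArray(parsed)) {
    return { ok: false, reason: "not-an-array" };
  }
  return { ok: true, outlines: parsed };
}

// A response cut off mid-array, as in the reproduction above.
console.log(parseOutlineResult({ text: '[{"title":"Login"', finishReason: "length" }));
```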

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published
  • I have signed the CLA

Additional Notes

Heads up: main currently has unrelated TypeScript errors in ImportCasesWizard.tsx introduced by #332. Those are not from this change and will need a separate fix for CI to go green here.

The two-phase test-case generation outline route silently capped
maxTokens at 1024 via Math.min(providerConfig.defaultMaxTokens ?? 1024,
1024), overriding whatever value the user set in LLM settings. When a
user requested "Many" outlines, the response often exceeded 1024
tokens, causing the JSON to truncate mid-array. The resulting parse
error suggested raising max-output-tokens — a setting the route
ignored.

- Drop the hard 1024 ceiling; honor providerConfig.defaultMaxTokens and
  fall back to resolvedPrompt.maxOutputTokens, matching the expand
  route's resolution order.
- Bump the no-config fallback from 1024 to 2048 so "Many" outlines fit
  out of the box.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@therealbrad therealbrad merged commit a27e63e into main May 22, 2026
5 checks passed
@therealbrad therealbrad deleted the fix/outline-tokens-cap branch May 22, 2026 15:57
@therealbrad
Contributor Author

🎉 This PR is included in version 0.29.8 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
