Skip to content

Improve parameter flexibility#160

Merged
Stephen Belanger (Qard) merged 1 commit into
mainfrom
parameter-flexibility
Jan 13, 2026
Merged

Improve parameter flexibility#160
Stephen Belanger (Qard) merged 1 commit into
mainfrom
parameter-flexibility

Conversation

@Qard

@Qard Stephen Belanger (Qard) commented Jan 12, 2026

Copy link
Copy Markdown
Contributor

This makes max_tokens configurable and makes both it and temperature fallback to model-provided defaults otherwise.

Fixes #149

@github-actions

github-actions Bot commented Jan 12, 2026

Copy link
Copy Markdown

Braintrust eval report

Autoevals (parameter-flexibility-1768263664)

Score Average Improvements Regressions
NumericDiff 72.8% (-1pp) - 2 🔴
Time_to_first_token 1.34tok (-0.12tok) 112 🟢 7 🔴
Llm_calls 1.55 (+0) - -
Tool_calls 0 (+0) - -
Errors 0 (+0) - -
Llm_errors 0 (+0) - -
Tool_errors 0 (+0) - -
Prompt_tokens 279.25tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -
Completion_tokens 19.3tok (+0tok) - -
Completion_reasoning_tokens 0tok (+0tok) - -
Total_tokens 298.54tok (+0tok) - -
Estimated_cost 0$ (+0$) - -
Duration 2.51s (+1s) 114 🟢 105 🔴
Llm_duration 2.78s (-0.25s) 114 🟢 5 🔴

@github-actions

Copy link
Copy Markdown

Braintrust eval report

Autoevals (parameter-flexibility-1768258553)

Score Average Improvements Regressions
NumericDiff 71.6% (-2pp) 1 🟢 7 🔴
Time_to_first_token 1.38tok (+0.05tok) 40 🟢 79 🔴
Llm_calls 1.55 (+0) - -
Tool_calls 0 (+0) - -
Errors 0 (+0) - -
Llm_errors 0 (+0) - -
Tool_errors 0 (+0) - -
Prompt_tokens 279.25tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -
Completion_tokens 19.3tok (+0tok) - -
Completion_reasoning_tokens 0tok (+0tok) - -
Total_tokens 298.54tok (+0tok) - -
Estimated_cost 0$ (+0$) - -
Duration 2.77s (+1.37s) 17 🟢 201 🔴
Llm_duration 2.87s (+0.08s) 28 🟢 89 🔴

@ibolmo Olmo Maldonado (ibolmo) left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick tests would be good, but looks reasonable.

This makes max_tokens configurable and makes both it and
temperature fallback to model-provided defaults otherwise.

@ibolmo Olmo Maldonado (ibolmo) left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ty!

@Qard Stephen Belanger (Qard) merged commit df6af22 into main Jan 13, 2026
7 checks passed
@Qard Stephen Belanger (Qard) deleted the parameter-flexibility branch January 13, 2026 17:04
@github-actions

github-actions Bot commented Jan 13, 2026

Copy link
Copy Markdown

Braintrust eval report

Autoevals (main-1768323886)

Score Average Improvements Regressions
NumericDiff 72.8% (-1pp) - 2 🔴
Time_to_first_token 1.38tok (-0.05tok) 88 🟢 28 🔴
Llm_calls 1.55 (+0) - -
Tool_calls 0 (+0) - -
Errors 0 (+0) - -
Llm_errors 0 (+0) - -
Tool_errors 0 (+0) - -
Prompt_tokens 279.25tok (+0tok) - -
Prompt_cached_tokens 0tok (+0tok) - -
Prompt_cache_creation_tokens 0tok (+0tok) - -
Completion_tokens 19.3tok (+0tok) - -
Completion_reasoning_tokens 0tok (+0tok) - -
Total_tokens 298.54tok (+0tok) - -
Estimated_cost 0$ (+0$) - -
Duration 3.56s (+2.12s) 96 🟢 123 🔴
Llm_duration 2.8s (-0.12s) 95 🟢 23 🔴

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Factuality hits json parse errors caused by exceeding token limit

2 participants