
fix(llm_config): disable reasoning_effort for Opus 4.7#670

Closed
juanmichelini wants to merge 1 commit into main from fix/opus-4-7-reasoning-effort

Conversation

@juanmichelini
Collaborator

Summary

Fixes an LLM failure specific to Opus 4.7; Opus 4.6 works with the same setup.

Root Cause

LiteLLM handles reasoning_effort differently for Opus 4.6 vs 4.7:

  • Opus 4.6: reasoning_effort="high" maps to type="adaptive" (model decides thinking budget)
  • Opus 4.7: reasoning_effort="high" maps to type="enabled" with budget_tokens=4096 (fixed)

This causes unexpected behavior for 4.7 (excessive thinking, token limit issues) while 4.6 works correctly.

The SDK sets reasoning_effort="high" by default for all models. When this is passed to LiteLLM, 4.7 gets the problematic fixed budget mapping while 4.6 gets the adaptive type which works fine.

Evidence

From LiteLLM's code (llms/anthropic/chat/transformation.py):

def _map_reasoning_effort(...):
    if AnthropicConfig._is_claude_4_6_model(model):
        return AnthropicThinkingParam(type="adaptive")  # 4.6 - works
    elif reasoning_effort == "high":
        return AnthropicThinkingParam(
            type="enabled",
            budget_tokens=DEFAULT_REASONING_EFFORT_HIGH_THINKING_BUDGET,  # 4096 for 4.7 - breaks
        )

Fix

In benchmarks/utils/llm_config.py, disable reasoning_effort for Opus 4.7 models by setting it to None. This allows them to use default behavior without the problematic fixed budget mapping.

Verification

Opus 4.7 after fix:
  reasoning_effort: None
  extended_thinking_budget: 200000

Opus 4.6 (unchanged):
  reasoning_effort: high
  extended_thinking_budget: 200000

The fix is minimal and targeted: it only affects Opus 4.7 and doesn't impact other models.


LiteLLM handles reasoning_effort='high' differently for Opus 4.6 vs 4.7:
- Opus 4.6: maps to type='adaptive' (model decides thinking budget)
- Opus 4.7: maps to type='enabled' with fixed budget_tokens=4096

This causes unexpected behavior for 4.7 (excessive thinking, token limit
issues) while 4.6 works correctly.

The fix disables reasoning_effort for Opus 4.7 models, allowing them to
use the default behavior without the problematic fixed budget mapping.

Co-authored-by: openhands <openhands@all-hands.dev>
Collaborator

@all-hands-bot left a comment


🟡 Acceptable - Targeted workaround for a real LiteLLM bug. The fix works but could be simplified. Missing concrete evidence of before/after behavior.

Comment on lines +32 to +37
if model_matches(llm.model, OPUS_4_7_MODELS) and llm.reasoning_effort is not None:
    llm = LLM(
        **{
            **llm.model_dump(),
            "reasoning_effort": None,
        }
Collaborator


🟡 Suggestion: Pydantic models support model_copy(update={...}), which is cleaner than manual dict unpacking:

Suggested change

if model_matches(llm.model, OPUS_4_7_MODELS) and llm.reasoning_effort is not None:
    llm = LLM(
        **{
            **llm.model_dump(),
            "reasoning_effort": None,
        }

if model_matches(llm.model, OPUS_4_7_MODELS) and llm.reasoning_effort is not None:
    llm = llm.model_copy(update={"reasoning_effort": None})

This avoids the nested dict unpacking and is more idiomatic for Pydantic models.

Comment on lines +6 to +11
from openhands.sdk.llm.utils.model_features import model_matches


# Models where LiteLLM handles reasoning_effort incorrectly.
# LiteLLM maps reasoning_effort="high" to type="adaptive" for 4.6 but to
# type="enabled" with fixed budget_tokens=4096 for 4.7, causing issues.
Collaborator


🟡 Suggestion: Add a comment linking to the upstream LiteLLM issue (if one exists) so we know when this workaround can be removed:

# Models where LiteLLM handles reasoning_effort incorrectly.
# TODO: Remove this workaround once LiteLLM fixes the mapping.
# See: https://github.com/BerriAI/litellm/issues/XXXXX
# LiteLLM maps reasoning_effort="high" to type="adaptive" for 4.6 but to
# type="enabled" with fixed budget_tokens=4096 for 4.7, causing issues.
OPUS_4_7_MODELS = [
    "claude-opus-4-7",
]

If no issue exists, consider filing one to track this upstream.

@juanmichelini
Collaborator Author

No longer needed.
