
Launch 1M-capable Claude models with the [1m] window suffix #8

Merged
OnlyTerp merged 1 commit into OnlyTerp:main from strandborg:fix/launcher-1m-context-window
Jun 3, 2026

Conversation

@strandborg

Contributor

Problem

When Opus 4.8 (or Sonnet 4.6) is picked through the launcher, the /context meter
fills ~5× faster than expected and pins at 100%, and auto-compaction is keyed to
the wrong limit. /context shows the window as … / 200k, not / 1M:

claude-opus-4-8
64.1k/200k tokens (32%)
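
The "~5×" figure follows from the ratio of the two windows. A minimal sketch of the arithmetic, assuming the meter is simply used tokens divided by window size and rounded to a whole percent (`context_percent` is a hypothetical helper, not a Claude Code function):

```python
def context_percent(used_tokens: int, window_tokens: int) -> int:
    """Share of the window consumed, rounded to the nearest whole percent."""
    return round(100 * used_tokens / window_tokens)


used = 64_100  # the 64.1k tokens from the /context reading above

# Against the mis-sized 200k window the meter reads 32%, as reported.
print(context_percent(used, 200_000))

# Against the 1M window the same usage is about 6%; 1M / 200k = 5, hence "~5x".
print(context_percent(used, 1_000_000))
```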

Root cause

Claude Code only switches its context meter and auto-compaction to the 1M
window (and sends the context-1m beta) when the session's model id carries the
[1m] suffix (e.g. claude-opus-4-8[1m]). The selector and config.json
advertise bare ids (claude-opus-4-8), and the launchers pass those straight
to claude --model, so the client defaults to the 200k window — even though
Opus 4.8 / 4.7 / 4.6 and Sonnet 4.6 serve 1M natively on the Anthropic API.

Nothing is lost upstream; the window is just mis-sized client-side. Verified that
claude --model 'claude-opus-4-8[1m]' is accepted and, after this change,
/context reads / 1M.

Fix

Both launchers now append [1m] to 1M-capable Claude base ids before launch, for
the selector pick and the default settings model:

  • bin/ultracode: uc_add_1m helper
  • windows/Start-UltraCode.ps1: Add-Uc1m function

Scoped so Haiku 4.5, claude-auto, and non-Claude routes
(Gemini / GPT / Composer) are never suffixed.

Config:

  • UC_FORCE_1M=0 — disable (back to bare ids)
  • UC_1M_MODELS — comma-separated override of the capable set
    (default claude-opus-4-8,claude-opus-4-7,claude-opus-4-6,claude-sonnet-4-6)
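
The suffixing rule and its two toggles can be sketched as follows. This is an illustration in Python rather than the actual bash or PowerShell helper; `add_1m` is a hypothetical name, and exact-match comparison against the capable set is an assumption about how the real helpers compare ids.

```python
import os

SUFFIX = "[1m]"
# Default capable set, as listed in the Config section above.
DEFAULT_1M_MODELS = (
    "claude-opus-4-8,claude-opus-4-7,claude-opus-4-6,claude-sonnet-4-6"
)


def add_1m(model_id: str, env=None) -> str:
    """Return model_id with [1m] appended when it is in the capable set."""
    env = os.environ if env is None else env

    # UC_FORCE_1M=0 disables suffixing entirely (back to bare ids).
    if env.get("UC_FORCE_1M", "1") == "0":
        return model_id

    # Empty and already-suffixed ids pass through unchanged.
    if not model_id or model_id.endswith(SUFFIX):
        return model_id

    # UC_1M_MODELS is a comma-separated override of the capable set.
    raw = env.get("UC_1M_MODELS", DEFAULT_1M_MODELS)
    capable = {m.strip() for m in raw.split(",") if m.strip()}

    # Assumption: ids are compared by exact match.
    return model_id + SUFFIX if model_id in capable else model_id


print(add_1m("claude-opus-4-8", env={}))   # suffixed
print(add_1m("claude-auto", env={}))       # left bare
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.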

Testing

  • bash -n bin/ultracode clean; uc_add_1m unit-tested across opus/sonnet (→
    suffixed), haiku/auto/gemini/already-suffixed/empty (→ unchanged), and
    UC_FORCE_1M=0 (→ disabled).
  • windows/Start-UltraCode.ps1 reviewed (${ModelId}[1m] interpolation; no
    pwsh available in the dev env to run it).
  • End-to-end: relaunched, picked Opus 4.8, confirmed /context now shows / 1M.

Note

If an Anthropic-passthrough deployment can fall back to a backend capped at 200k,
a conversation that grows past 200k may then fail there — the new
TROUBLESHOOTING entry calls this out.

🤖 Generated with Claude Code

Claude Code only switches its context meter and auto-compaction to the 1M
window (and sends the context-1m beta) when the session model id carries the
[1m] suffix. The selector and config.json advertise bare ids (claude-opus-4-8),
so the launchers started Claude Code on a 200k window even for Opus 4.8 /
Sonnet 4.6, which serve 1M natively -- the /context meter filled ~5x too fast
and pinned at 100%, and auto-compaction was keyed to the wrong limit.

Append [1m] to 1M-capable Claude base ids in both the POSIX (bin/ultracode)
and Windows (Start-UltraCode.ps1) launchers, for the selector pick and the
default settings model. Scoped so Haiku 4.5, claude-auto, and non-Claude
routes (Gemini/GPT/Composer) are never suffixed. Configurable via UC_FORCE_1M
(set 0 to disable) and UC_1M_MODELS (override the capable set). Symptom and
toggles documented in docs/TROUBLESHOOTING.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@OnlyTerp OnlyTerp merged commit 4210254 into OnlyTerp:main Jun 3, 2026
4 checks passed
OnlyTerp pushed a commit that referenced this pull request Jun 3, 2026
Companion to #8. That PR makes the launchers append a "[1m]" suffix to 1M-capable
Claude model ids so Claude Code sizes its context meter (and auto-compaction) to
the 1M window. The suffix is a client-side convention, not an Anthropic model id;
the 1M window itself is carried by the context-1m beta header.

The proxy looks up routes (and orchestrator/worker picks) by exact id, so once a
pick carries "[1m]" a configured route like "claude-opus" no longer matches
"claude-opus[1m]" and routing falls through / breaks. This strips a trailing
"[1m]" from the request model id up front, so "<id>[1m]" behaves exactly like
"<id>" everywhere downstream. The 1M window is unaffected (the beta header is left
untouched), and the strip is universal so stock-traffic remap and the Auto Router
stay consistent too.
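
A minimal sketch of the strip-before-lookup step described above. The `ROUTES` table and `resolve_route` function are hypothetical stand-ins for the proxy's real routing structures; only the exact-id lookup and the trailing-suffix strip come from the description.

```python
SUFFIX = "[1m]"

# Hypothetical route table: configured route id -> upstream model id.
ROUTES = {"claude-opus": "claude-opus-4-8"}


def strip_1m(model_id: str) -> str:
    """Drop one trailing [1m]; ids without the suffix are returned as-is."""
    if model_id.endswith(SUFFIX):
        return model_id[: -len(SUFFIX)]
    return model_id


def resolve_route(requested_id: str):
    """Look up a route by exact id after normalizing the request id."""
    return ROUTES.get(strip_1m(requested_id))


# Without the strip, the exact-id lookup misses the configured route.
print(ROUTES.get("claude-opus[1m]"))     # None
print(resolve_route("claude-opus[1m]"))  # claude-opus-4-8
```

Only the model id is rewritten here; request headers are not touched, which matches the statement that the beta header is left alone.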

Depends on #8: with #8 not merged nothing emits the suffix, so this is a no-op on
its own and only earns its keep once #8 lands.

Includes a unit test.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
OnlyTerp pushed a commit that referenced this pull request Jun 11, 2026
Follow-up to #8 (launcher appends [1m]) and #10 (proxy strips [1m] before
routing). Those give the 1M context window to a launch-time pick of a stock id,
but an in-session /model switch to a CONFIGURED real-Claude route -- e.g. the
shipped `claude-opus` route, which maps to claude-opus-4-8 -- used the bare
gateway id, so /context showed 200k and auto-compaction was mis-keyed.

Claude Code sizes its context meter to 1M only when the model id it holds carries
the [1m] suffix (verified: it honors the suffix on a custom gateway id, not just
native ids). So the proxy now ADVERTISES the suffix on /v1/models + /healthz for
real-Claude PASSTHROUGH routes whose upstream model is 1M-capable. The /model
picker id then carries [1m] and the 1M window engages even on in-session switches.

- The suffix is stripped before routing (the inline strip from #10 is refactored
  into a shared _strip_1m helper) and normalized off the sticky orchestrator/worker
  selection, so internal route ids stay clean (claude-opus[1m] -> claude-opus).
- Scope: real-Claude passthrough routes only. Worker ("Worker -> ...") entries and
  non-passthrough routes (openai_compat / codex / cursor) are never suffixed.
- Launcher: add `claude-opus` (the shipped route) to the UC_1M_MODELS default so a
  launch/selector pick of it matches the /model behavior.

Toggles: UC_ADVERTISE_1M=0 (off), UC_1M_UPSTREAM (the 1M-capable upstream set).
Verified live: picking claude-opus via /model now shows /context = / 1M. Tests +
doctor pass; TROUBLESHOOTING updated.
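
The advertising rule can be sketched like this. The `Route` dataclass, its field names, and `advertised_id` are hypothetical, and the capable-upstream set is an assumption that reuses the launcher default from #8, since the default for UC_1M_UPSTREAM is not stated here.

```python
from dataclasses import dataclass

SUFFIX = "[1m]"

# Assumption: same capable set as the launcher default in #8.
ASSUMED_1M_UPSTREAM = frozenset(
    {"claude-opus-4-8", "claude-opus-4-7", "claude-opus-4-6", "claude-sonnet-4-6"}
)


@dataclass(frozen=True)
class Route:
    route_id: str        # id shown in the /model picker, e.g. "claude-opus"
    kind: str            # "passthrough", "openai_compat", "codex", "cursor"
    upstream_model: str  # model id the route maps to


def advertised_id(route: Route, advertise: bool = True,
                  capable=ASSUMED_1M_UPSTREAM) -> str:
    """Id to list on /v1/models: suffixed only for 1M-capable passthrough routes."""
    if not advertise:                        # mirrors UC_ADVERTISE_1M=0
        return route.route_id
    if route.kind != "passthrough":          # non-passthrough routes stay bare
        return route.route_id
    if route.upstream_model not in capable:  # e.g. a Haiku upstream
        return route.route_id
    return route.route_id + SUFFIX


opus = Route("claude-opus", "passthrough", "claude-opus-4-8")
print(advertised_id(opus))  # claude-opus[1m]
```

Worker entries are omitted from the sketch; per the scope note above they would take the bare-id branch as well.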

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>