fix(messages): support assistant message prefill via trailing-message rewrite #261

Open
kiote wants to merge 2 commits into ericc-ch:master from kiote:fix/assistant-prefill-trailing-message
kiote commented Jun 8, 2026

Problem

Some Copilot upstream models reject any /v1/messages request whose message list ends with an assistant turn (an assistant message prefill), returning a 400:

This model does not support assistant message prefill.
The conversation must end with a user message.

Anthropic clients legitimately use prefill to constrain a reply. Notably, Claude Code sends prefilled assistant turns for some operations, so those requests fail when proxied through copilot-api against newer Claude model IDs (e.g. claude-opus-4-6/4-7/4-8, claude-sonnet-4-6). Older IDs (*-4-5) happen to accept prefill, which makes the failure look model-specific and confusing.

Fix

In translateToOpenAI, after building the OpenAI message list, detect a trailing assistant message that is not an in-flight tool call, drop it, and re-inject its text as a user instruction asking the model to output only the seamless continuation.

This reproduces Anthropic's prefill contract — the response excludes the prefill text, so the client can stitch it back on — while satisfying upstream's "must end with a user message" requirement. Empty prefills fall back to a simple "Continue." nudge. Trailing assistant tool calls are left untouched, since they are part of a tool exchange rather than a prefill.
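
The rewrite described above could look roughly like the sketch below. It is not the code from this PR: the `ChatMessage` shape is a simplified stand-in for the OpenAI message types, `rewriteTrailingAssistantPrefill` and `buildContinuationPrompt` are hypothetical names, and the instruction wording is illustrative, since the PR description does not quote the exact text.

```typescript
// Simplified, hypothetical message shape; the real types in the proxy are richer.
interface ToolCall {
  id: string
  type: "function"
  function: { name: string; arguments: string }
}

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool"
  content: string | null
  tool_calls?: ToolCall[]
}

// Illustrative wording only (an assumption, not the PR's actual prompt).
function buildContinuationPrompt(prefill: string): string {
  return (
    "Your reply must continue seamlessly from the text below. "
    + "Output only the continuation and do not repeat the text itself:\n\n"
    + prefill
  )
}

// Replace a trailing assistant prefill with a user instruction so the
// conversation ends with a user message. Returns the input unchanged when
// there is nothing to rewrite.
function rewriteTrailingAssistantPrefill(messages: ChatMessage[]): ChatMessage[] {
  const last = messages[messages.length - 1]
  if (!last || last.role !== "assistant") return messages

  // A trailing assistant message carrying tool calls belongs to a tool
  // exchange, not a prefill, so it is left untouched.
  if (last.tool_calls && last.tool_calls.length > 0) return messages

  const prefill = last.content ?? ""
  // Empty (or whitespace-only) prefills fall back to a simple nudge.
  const instruction =
    prefill.trim().length > 0 ? buildContinuationPrompt(prefill) : "Continue."

  const rewritten: ChatMessage = { role: "user", content: instruction }
  return [...messages.slice(0, -1), rewritten]
}
```

Because the model is asked for the continuation only, a client that sent `{` as a prefill can still prepend it to the response, as it would against Anthropic's API.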

Tests

  • Adjusted the existing thinking-block test so its assistant message is no longer the final message (it was inadvertently a prefill case).
  • Added tests for: text prefill rewrite, empty prefill fallback, and the tool-call exception.

bun run typecheck, bun run lint:all, and bun test all pass (29 tests).

kiote added 2 commits June 8, 2026 19:58
Some Copilot upstream models reject a request whose message list ends
with an assistant turn ("assistant message prefill") with a 400:

  "This model does not support assistant message prefill. The
   conversation must end with a user message."

Anthropic clients such as Claude Code legitimately use prefill to
constrain a reply, so these requests fail when proxied. Detect a
trailing assistant message (that is not an in-flight tool call), drop
it, and re-inject its text as a user instruction asking the model to
emit only the seamless continuation. This reproduces Anthropic's
prefill contract (the response excludes the prefill text) while
satisfying upstream's "must end with a user message" requirement.

Add tests covering text prefill, empty prefill, and the tool-call
exception, and adjust the existing thinking-block test so its assistant
message is no longer the final message.

translateModelName collapsed claude-opus-4-8 -> claude-opus-4, but Copilot
has no bare "claude-opus-4" model (its IDs are dotted: claude-opus-4.8,
claude-sonnet-4.6, ...), so dashed IDs returned HTTP 400 model_not_supported.
Claude Code sends dashed IDs by default (ANTHROPIC_MODEL=claude-opus-4-8),
so the proxy rejected every request from it.

Rewrite the trailing "-N" minor version to ".N" across opus/sonnet/haiku,
mirroring upstream copilot-api normalization. Verified: claude-opus-4-8 and
claude-sonnet-4-6 with trailing-assistant prefill both return 200.
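
A minimal sketch of the dashed-to-dotted rewrite described in this commit message, under stated assumptions: `normalizeClaudeModelId` is a hypothetical name (the PR changes `translateModelName`), and the one-or-two-digit limit on the minor version is an assumption added here so that longer numeric suffixes are not mistaken for a minor version.

```typescript
// Matches e.g. "claude-opus-4-8": family, major version, then a short
// trailing "-N" minor version. The {1,2} digit limit is an assumption.
const DASHED_MINOR_VERSION = /^(claude-(?:opus|sonnet|haiku)-\d+)-(\d{1,2})$/

// Rewrite a trailing "-N" minor version to ".N"; any ID that does not
// match the pattern is returned unchanged.
function normalizeClaudeModelId(model: string): string {
  return model.replace(DASHED_MINOR_VERSION, "$1.$2")
}

console.log(normalizeClaudeModelId("claude-opus-4-8")) // claude-opus-4.8
console.log(normalizeClaudeModelId("claude-sonnet-4-6")) // claude-sonnet-4.6
```

IDs that are already dotted, or that belong to other model families, pass through untouched, so the function is safe to apply to every incoming request.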