Add models command, formalize /responses support, harden defaults #263

Open
RazonIn4K wants to merge 9 commits into ericc-ch:master from RazonIn4K:feat/models-responses-hardening

@RazonIn4K

Add models command, formalize /responses support, harden defaults

This branch is 9 commits ahead of master. It adds a non-interactive
models command, formalizes and hardens the /responses adapter the chat
handler already depended on, makes the server local-only by default, and fixes
documentation, Docker, and CI drift.

Summary of changes

Features

  • models CLI command (e273b3f) — inspect the current Copilot models and
    their supported endpoints without starting the server. Useful for
    non-interactive deployments. Supports --json, --github-token,
    --account-type, --show-token, and --proxy-env.
  • /responses endpoint routing (fb77680) — models that only advertise
    /responses (and not /chat/completions) are now transparently routed
    through a Responses API adapter that converts both non-streaming and streaming
    responses back into Chat Completions shape.
  • Tool calls over /responses (a695374) — chat-completions tools and
    tool_choice are mapped into the Responses API shape, assistant tool_calls
    and tool results become function_call/function_call_output input items, and
    function_call outputs and function_call_arguments deltas are translated
    back into tool_calls (non-streaming and streaming). Replaces the earlier
    drop-with-warning behavior.
  • Local-only by default (44c74c7) — the server now binds to 127.0.0.1
    unless --host is passed, so it is not reachable from other machines out of
    the box. A warning prints when binding to a non-local host. The Docker
    entrypoint passes --host 0.0.0.0 so published ports keep working.
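
For reviewers, the request-side tool mapping described above can be sketched roughly as follows. This is a minimal illustration written against the documented OpenAI Chat Completions and Responses shapes, not the code in this branch; every type and function name here is hypothetical, and Copilot's variant of the API may differ.

```typescript
// Hypothetical, trimmed-down types for illustration; not the adapter's own.
type JsonSchema = Record<string, unknown>

interface ChatTool {
  type: "function"
  function: { name: string; description?: string; parameters?: JsonSchema }
}

type ChatToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } }

interface ChatToolCall {
  id: string
  type: "function"
  function: { name: string; arguments: string }
}

type ChatMessage =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: ChatToolCall[] }
  | { role: "tool"; tool_call_id: string; content: string }

interface ResponsesTool {
  type: "function"
  name: string
  description?: string
  parameters?: JsonSchema
}

type ResponsesToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; name: string }

type ResponsesInputItem =
  | { role: "system" | "user" | "assistant"; content: string }
  | { type: "function_call"; call_id: string; name: string; arguments: string }
  | { type: "function_call_output"; call_id: string; output: string }

// Responses tools are flat: name/description/parameters move up one level.
function toResponsesTool(tool: ChatTool): ResponsesTool {
  return { type: "function", ...tool.function }
}

// String choices pass through; a named function choice is flattened.
function toResponsesToolChoice(choice: ChatToolChoice): ResponsesToolChoice {
  return typeof choice === "string"
    ? choice
    : { type: "function", name: choice.function.name }
}

// Assistant tool_calls become function_call items and tool results become
// function_call_output items, linked through call_id.
function toResponsesInput(messages: ChatMessage[]): ResponsesInputItem[] {
  const items: ResponsesInputItem[] = []
  for (const message of messages) {
    if (message.role === "tool") {
      items.push({
        type: "function_call_output",
        call_id: message.tool_call_id,
        output: message.content,
      })
      continue
    }
    if (message.role === "assistant") {
      if (message.content) {
        items.push({ role: "assistant", content: message.content })
      }
      for (const call of message.tool_calls ?? []) {
        items.push({
          type: "function_call",
          call_id: call.id,
          name: call.function.name,
          arguments: call.function.arguments,
        })
      }
      continue
    }
    items.push({ role: message.role, content: message.content })
  }
  return items
}
```

The sketch reuses the chat-completions tool call id as call_id, which is an assumption; the maintainer note on call_id vs id applies here as well.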

Fixes

  • Lazy VS Code version fetch (3d24eef) — the VS Code version is now fetched
    only when called, not as an import-time network side effect.
  • Tolerate invalid tool-call JSON (dec3694) — malformed tool arguments from
    Copilot no longer crash translateToAnthropic; the tool input falls back to
    an empty object.
  • Harden /responses stream adapter (e8f316c) — malformed or non-JSON SSE
    events (e.g. keepalives) are skipped instead of killing the stream, and a
    warning is logged when tool definitions/calls are dropped for
    /responses-only models.
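
Both the tool-call JSON fix and the stream hardening come down to the same idea: parse defensively and degrade instead of throwing. A minimal sketch of that pattern, assuming hypothetical helper names (safeJsonParse, parseToolInput, parseSseEvents) that are not taken from this branch:

```typescript
// Returns undefined instead of throwing when the text is not valid JSON.
function safeJsonParse(text: string): unknown {
  try {
    return JSON.parse(text)
  } catch {
    return undefined
  }
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value)
}

// Tool arguments: malformed JSON falls back to an empty object.
// Treating non-object JSON (numbers, arrays) the same way is an assumption.
function parseToolInput(rawArguments: string): Record<string, unknown> {
  const parsed = safeJsonParse(rawArguments)
  return isPlainObject(parsed) ? parsed : {}
}

// SSE data payloads: anything that does not parse to a JSON object
// (keepalives, sentinels, truncated events) is skipped, not fatal.
function parseSseEvents(dataPayloads: string[]): Record<string, unknown>[] {
  const events: Record<string, unknown>[] = []
  for (const data of dataPayloads) {
    const parsed = safeJsonParse(data)
    if (!isPlainObject(parsed)) continue
    events.push(parsed)
  }
  return events
}
```

In a real stream the skip branch is also a natural place to log at debug level, so dropped events stay visible without interrupting the client.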

Tests

  • Anthropic edge cases (dec3694) — images, tool_result ordering
    (including multiple results per message), mixed text/tool streaming block
    transitions, invalid tool JSON, and cache-token accounting.
  • /responses adapter (fb77680, e8f316c) — endpoint selection,
    non-streaming conversion, streaming event conversion, malformed-event
    skipping, and unknown-event handling.
  • count_tokens handler (ddede24) — drives the real Hono route and locks
    the claude (1.15) and grok (1.03) multipliers, the 346/480-token tool
    overhead, the mcp__/claude-code beta exemption, and the invalid-JSON
    fallback.
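
To make the numbers those count_tokens tests lock in easier to follow, here is one plausible reading of the adjustment as a standalone function. Everything beyond the constants themselves is an assumption: that 346 applies to claude models and 480 to grok models, that the overhead is added before the multiplier, that the exemption requires both a claude-code beta header and an mcp__-prefixed tool, and that the result is rounded. The names (CountInput, adjustTokenCount) are hypothetical.

```typescript
interface CountInput {
  model: string
  baseTokens: number // raw tokenizer count before adjustments
  toolNames: string[]
  anthropicBeta?: string // value of the anthropic-beta header, if any
}

function adjustTokenCount(input: CountInput): number {
  const { model, baseTokens, toolNames, anthropicBeta } = input
  let tokens = baseTokens

  // Assumed exemption: claude-code beta requests that carry MCP tools.
  const isClaudeCodeBeta = anthropicBeta?.startsWith("claude-code") ?? false
  const hasMcpTool = toolNames.some((name) => name.startsWith("mcp__"))
  const exempt = isClaudeCodeBeta && hasMcpTool

  // Assumed pairing: 346 tokens of tool overhead for claude, 480 for grok.
  if (toolNames.length > 0 && !exempt) {
    if (model.startsWith("claude")) tokens += 346
    else if (model.startsWith("grok")) tokens += 480
  }

  // Multipliers named in the PR: claude 1.15, grok 1.03.
  if (model.startsWith("claude")) return Math.round(tokens * 1.15)
  if (model.startsWith("grok")) return Math.round(tokens * 1.03)
  return tokens
}
```

If the handler applies these steps in a different order or pairs the overheads differently, the tests in this branch are the authority, not this sketch.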

Docs / chore

  • Non-root Docker (58249ba) — Dockerfile runs as USER bun; README
    volume path corrected to /home/bun/.local/share/copilot-api.
  • Docs drift: AGENTS.md updated (tsup → tsdown, corrected test
    command); README documents the models command, the --host flag, and
    single-account/local-use assumptions.
  • GitHub Pages workflow — now deploys from master (the branch this fork and
    upstream actually expose).

Verification

  • bun run lint:all — passing
  • bun run typecheck — passing
  • bun test: 60 passing (was 29)
  • bun run build — passing

Notes for the maintainer

  • The /responses adapter now forwards function tool calls, but image content
    is still stringified ([image: <url>]) rather than sent as a structured
    input-image part. The tool mapping was implemented against the documented
    OpenAI Responses API shapes; it should be validated against a live
    /responses-only Copilot model, since Copilot's variant may differ in field
    names (e.g. call_id vs id).
  • Docker bind-mount permissions: under USER bun, the host directory mounted at
    /home/bun/.local/share/copilot-api must be writable by uid 1000. If you hit
    an EACCES on first run, chown 1000:1000 ./copilot-data on the host.

Razon added 9 commits June 12, 2026 00:50
  • Server is now local-only unless --host is explicitly passed.
    Docker entrypoint passes --host 0.0.0.0 so published ports keep working.
  • Covers images, tool_result ordering, mixed text/tool streams,
    invalid tool JSON, and cache token accounting.
  • Skip malformed or non-JSON SSE events instead of crashing the stream.
    Warn when tool definitions/calls are dropped for /responses-only models.
  • Drives the real Hono route and asserts the claude (1.15) and grok (1.03)
    multipliers, the 346/480-token tool overhead, the mcp__/claude-code beta
    exemption, and the invalid-JSON fallback.
  • Map chat-completions tools/tool_choice into the Responses API shape, convert
    assistant tool_calls + tool results into function_call/function_call_output
    input items, and translate function_call output items and
    function_call_arguments deltas back into chat-completions tool_calls
    (non-streaming and streaming). Replaces the previous drop-with-warning behavior.
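
The response-side half of that last commit (function_call output items back into chat-completions tool_calls) might look roughly like the sketch below for the non-streaming case. It follows the documented OpenAI Responses output shape; the type and function names are hypothetical, and using call_id as the chat tool call id is an assumption that should be checked against a live Copilot model.

```typescript
// Hypothetical, trimmed-down view of Responses API output items.
type ResponsesOutputItem =
  | { type: "message"; content: { type: "output_text"; text: string }[] }
  | { type: "function_call"; call_id: string; name: string; arguments: string }

interface ChatToolCall {
  id: string
  type: "function"
  function: { name: string; arguments: string }
}

interface ChatChoice {
  message: {
    role: "assistant"
    content: string | null
    tool_calls?: ChatToolCall[]
  }
  finish_reason: "stop" | "tool_calls"
}

function toChatChoice(output: ResponsesOutputItem[]): ChatChoice {
  let text = ""
  const toolCalls: ChatToolCall[] = []

  for (const item of output) {
    if (item.type === "message") {
      // Concatenate all text parts into one assistant message body.
      for (const part of item.content) text += part.text
    } else {
      toolCalls.push({
        id: item.call_id, // assumption: call_id, not the item's own id
        type: "function",
        function: { name: item.name, arguments: item.arguments },
      })
    }
  }

  if (toolCalls.length === 0) {
    return { message: { role: "assistant", content: text }, finish_reason: "stop" }
  }
  return {
    message: { role: "assistant", content: text || null, tool_calls: toolCalls },
    finish_reason: "tool_calls",
  }
}
```

The streaming path would do the same per event, emitting a tool_calls delta for each function_call_arguments delta; that part is omitted here.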
Copilot AI review requested due to automatic review settings June 13, 2026 17:08

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds support for routing certain models through the Copilot Responses endpoint while preserving an OpenAI-compatible chat completions surface, and improves CLI/deployment ergonomics and edge-case handling.

Changes:

  • Add Responses API adapter (payload translation + streaming event → chat chunk conversion) and route selection based on model supported endpoints.
  • Add models CLI subcommand and enhance server startup options with --host (plus Docker binding + docs updates).
  • Add tests covering Responses adapter, token counting logic, and Anthropic translation edge-cases (tools, images, cached tokens, streaming behavior).
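
As a rough illustration of the route selection mentioned in the first bullet, the decision can be expressed as a small pure function over the model's supported_endpoints. The function name and the fallback to /chat/completions when the field is missing are assumptions, not confirmed behavior of this branch.

```typescript
// Minimal view of a model entry; only the field relevant to routing.
interface ModelInfo {
  id: string
  supported_endpoints?: string[]
}

type Endpoint = "/chat/completions" | "/responses"

// Route to /responses only when the model advertises it and does not
// advertise /chat/completions. Unknown models keep the existing path.
function selectEndpoint(model: ModelInfo | undefined): Endpoint {
  const endpoints = model?.supported_endpoints ?? []
  const responsesOnly =
    endpoints.includes("/responses") && !endpoints.includes("/chat/completions")
  return responsesOnly ? "/responses" : "/chat/completions"
}
```

Keeping the choice in one function like this makes the "endpoint selection" test case trivial to cover without a network call.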

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/create-responses.test.ts | Adds unit tests for Responses→ChatCompletion conversion and streaming chunk mapping (text + tool_calls). |
| tests/count-tokens-handler.test.ts | Adds tests for /count_tokens multiplication/overhead rules and invalid JSON fallback. |
| tests/anthropic-edge-cases.test.ts | Adds edge-case tests for Anthropic↔OpenAI translations (images, tool_result ordering, cached tokens, stream block boundaries). |
| src/services/copilot/create-responses.ts | Implements Responses endpoint adapter, including payload conversion, tool forwarding, and SSE event mapping. |
| src/routes/chat-completions/handler.ts | Switches to Responses endpoint for models that only support /responses, including streaming bridging. |
| src/services/copilot/get-models.ts | Extends model shape with supported_endpoints to drive endpoint selection. |
| src/routes/messages/non-stream-translation.ts | Prevents crashes on malformed tool-call JSON by safely parsing tool arguments. |
| src/start.ts | Adds --host bind option, updates server URL display, and warns on non-local binding. |
| src/models.ts | Adds models subcommand to list available Copilot models (optionally JSON). |
| src/main.ts | Registers new models subcommand. |
| package.json | Adds bun run models script. |
| Dockerfile / entrypoint.sh | Runs as bun user; binds server to 0.0.0.0 inside container to support published ports. |
| README.md / AGENTS.md | Updates docs for Docker paths, CLI usage, models command, and build/test notes. |
| .github/workflows/deploy-pages.yml | Changes Pages deploy trigger branch to master. |
| src/services/get-vscode-version.ts | Removes stray top-level invocation. |

Comment thread src/start.ts
Comment on lines +68 to +75
const displayHost = options.host === "0.0.0.0" ? "localhost" : options.host
const serverUrl = `http://${displayHost}:${options.port}`

if (options.host !== "127.0.0.1" && options.host !== "localhost") {
  consola.warn(
    `Server will listen on ${options.host} and may be reachable from other machines. Use the default host (127.0.0.1) for local-only access.`,
  )
}
Comment on lines 4 to +5
  push:
-   branches: [ "main" ]
+   branches: [master]
Comment on lines +15 to +19
import {
  translateToAnthropic,
  translateToOpenAI,
} from "../src/routes/messages/non-stream-translation"
import { translateChunkToAnthropicEvents } from "../src/routes/messages/stream-translation"
Comment on lines +108 to +110
const isNonStreamingResponse = (
  response: Awaited<ReturnType<typeof createResponsesFromChatCompletions>>,
): response is ResponseApiResponse => !(Symbol.asyncIterator in response)
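
For context on the guard above: it narrows a union of "plain response object" and "async iterable stream" by checking for Symbol.asyncIterator. A self-contained sketch of the same technique, using stand-in types (the ResponseApiResponse fields and AdapterResult below are hypothetical, not the branch's definitions):

```typescript
// Stand-in for the real response type; fields are illustrative only.
interface ResponseApiResponse {
  id: string
  output_text: string
}

// The adapter is assumed to return either a full response or a stream.
type AdapterResult = ResponseApiResponse | AsyncIterable<string>

// Async iterables expose Symbol.asyncIterator (possibly via the prototype
// chain, which the `in` operator also checks); plain responses do not.
const isNonStreaming = (result: AdapterResult): result is ResponseApiResponse =>
  !(Symbol.asyncIterator in result)

async function* fakeStream(): AsyncGenerator<string> {
  yield "chunk"
}

function describe(result: AdapterResult): string {
  return isNonStreaming(result) ? `response ${result.id}` : "stream"
}
```

One caveat of this style of guard is that it relies on the non-streaming object never gaining an async iterator of its own; a discriminant field would be more explicit if the types ever grow.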