feat(kimi): KimiRuntime scaffolding (Iteration A) by pofallon · Pull Request #11 · get2knowio/airframe

pofallon · 2026-05-18T16:17:14Z

Summary

Lands the Kimi adapter's protocol surface ahead of substantive SDK wiring, per dev-docs/kimi-adapter-plan.md. The behavioural slice (execute / stream / cancel / session-resume / tools / permission / hooks / budget) ships in Iterations B–F.
- New `KimiRuntime(AgentRuntime)` + bespoke `KimiSession(AgentSession)` with full protocol-correct execute/stream signatures, all feature kwargs gated against `runtime.supports()`, and `NotImplementedError` only at the terminal SDK call site.
- New `KimiOptions` namespace (Iteration A scaffolding; populates in Iterations B–F) joined to the `ProviderOptions` tagged union.
- Provider IDs: `"kimi"` added to the canonical list; `"moonshot"` reserved for a future OpenAI-compat sibling.

Test plan

- `make ci` green: 927 passed, 40 skipped (the latter all integration-marker tests that self-skip without credentials).
- New `tests/test_kimi.py`: 29 unit tests covering identity, defaults, auth chain, validate_binding, supports, unwrap, close+reset, execute NotImplementedError shape, list_models fallback, session factory, KimiOptions acceptance, emittable hook kinds.
- New `tests/test_kimi_conformance.py`: 28 structural contracts from `airframe.testing.contracts` against a no-credentials `KimiRuntime` fixture. All pass.
- `tests/test_discovery.py` expected sets and filtered-list tests extended with `"kimi"`.
- Iteration B will add `tests/test_kimi_integration.py` once the SDK-backed turn execution lands.

Notable: upstream dep conflict

`kimi-cli` 1.12.0 → `fastmcp` 2.12.5 → `mcp<1.17`, but `claude-agent-sdk` 0.2.82 → `mcp>=1.23`. They can't be co-installed in one environment. Handled with:

- `python_version >= "3.12"` marker on the `kimi-agent-sdk` dep (kimi-agent-sdk's Python floor is 3.12; airframe's is 3.11).
- `[tool.uv].conflicts` declares `[kimi]` mutually exclusive with `[claude]`, `[all]`, and the `dev` / `test` groups.
- `Makefile` + both GH workflows use `uv sync --all-extras --no-extra kimi --group dev`.

End-users wanting Kimi will `pip install airframe-agents[kimi]` in a fresh venv without `[claude]` / `[all]`. Until Moonshot ships a newer `kimi-agent-sdk` that widens the `kimi-cli` range, that's the only honest path.

🤖 Generated with Claude Code

Lands the Kimi adapter's protocol surface ahead of substantive SDK wiring, per dev-docs/kimi-adapter-plan.md. The behavioural slice (execute / stream / cancel / session-resume / tools / permission / hooks / budget) ships in Iterations B–F.

Adapter surface:

- src/airframe/adapters/kimi.py: KimiRuntime(AgentRuntime) + KimiSession(AgentSession). PROVIDER_ID="kimi"; REQUIRES_PACKAGE="kimi_agent_sdk"; EXTRA_NAME="kimi"; lazy SDK import; validate_binding accepts kimi-* model IDs only. Auth chain: explicit api_key= → KIMI_API_KEY env → RuntimeAuthError. Base URL / default model resolve via the same three-step pattern. list_models() returns a curated fallback catalogue (Iteration B / E enriches with live Moonshot /v1/models + per-model pricing). SUPPORTED_FEATURES declares only STRUCTURED_OUTPUT_JSON_SCHEMA (the universal floor); the other flags flip on as features land. EMITTABLE_HOOK_KINDS = frozenset().
- KimiSession is a protocol-correct stub: execute / stream signatures match the protocol (including Phase 5's max_turns / max_budget_usd that the conformance contract pins); every per-feature kwarg gates against runtime.supports() and raises UnsupportedFeatureError when the capability is declined. The terminal SDK call raises NotImplementedError until Iteration B wires the real kimi-agent-sdk.Session lifecycle.

Wiring:

- src/airframe/options.py: new KimiOptions namespace (empty Iteration A scaffolding; populated in Iterations B–F). Added to the ProviderOptions tagged union.
- src/airframe/discovery.py + airframe/__init__.py: register KimiRuntime and export KimiOptions / KimiRuntime.
- src/airframe/testing/contracts.py: test_session_rejects_wrong_provider_options_namespace's matching + all_namespaces extended with the kimi namespace.
- pyproject.toml: new [kimi] extra pinning kimi-agent-sdk>=0.0.5,<0.1 with a python_version >= "3.12" marker (kimi-agent-sdk's Python floor is stricter than airframe's; pip install on 3.11 becomes a no-op rather than failing at the resolver).
- pyproject.toml [tool.uv].conflicts: declares (kimi ↔ claude), (kimi ↔ all), and (kimi ↔ test / dev groups). Upstream conflict: kimi-cli 1.12.0 → fastmcp 2.12.5 → mcp<1.17 vs claude-agent-sdk 0.2.82 → mcp>=1.23. Until Moonshot publishes a kimi-agent-sdk that widens the kimi-cli range, the two SDKs can't be co-installed. Users wanting both split into separate venvs.
- Makefile + .github/workflows/ci.yml + release.yml: install invocations gain --no-extra kimi so the dev/CI envs (which need claude-agent-sdk) can resolve. Kimi unit tests run against a mocked surface and don't need the real SDK installed.
- CLAUDE.md: "kimi" added to the canonical IDs list; "moonshot" reserved alongside for a future OpenAI-compat sibling fronting api.moonshot.ai/v1.

Tests:

- tests/test_kimi.py: identity / defaults / auth chain / validate_binding / supports / unwrap / close+reset / execute NotImplementedError shape / list_models / session factory / KimiOptions namespace acceptance / EMITTABLE_HOOK_KINDS.
- tests/test_kimi_conformance.py: wires every relevant structural contract from airframe.testing.contracts against a no-credentials KimiRuntime fixture. All 28 contracts pass; behavioural integration deferred to Iteration B's tests/test_kimi_integration.py.
- tests/test_discovery.py: expected-set and filtered-list tests extended with "kimi".

`make ci` green: 927 passed, 40 skipped (the latter all integration-marker tests that self-skip without credentials).

…el/resume)

Replaces the Iteration A stub with a real wrapper around kimi-agent-sdk's Session API: Session.create/Session.resume, WireMessage stream consumption, ApprovalRequest dispatch, and SDK exception classification into the airframe Runtime*Error hierarchy.

* execute / stream / cancel implemented end-to-end via the SDK
* cost telemetry populated from TokenUsage WireMessages
* SUPPORTED_FEATURES: STRUCTURED_OUTPUT_JSON_SCHEMA, STREAMING, CANCEL, SESSION_RESUME (structured-output path lands in Iteration D via the MCP forced-tool bridge)
* KIMI_API_KEY env-var bridge for the SDK's auth resolver (restored on close so we don't leak per-session keys process-wide)
* KimiOptions gains working_directory; resolves via KaosPath on the adapter side
* tests/test_kimi_session.py exercises the SDK-backed path via sys.modules injection (the SDK can't be installed alongside claude-agent-sdk on 3.12; see [tool.uv.conflicts])
* examples/probe_kimi.py: single-turn live probe with install notes for the fresh-venv requirement

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
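
The env-var bridge above sets `KIMI_API_KEY` for the SDK's resolver and restores the prior value on close. A minimal sketch of that save-and-restore idea follows. It uses a context manager for brevity, whereas the adapter is described as restoring on `close()`, and the name `bridged_env` is hypothetical.

```python
import os
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def bridged_env(name: str, value: str) -> Iterator[None]:
    """Temporarily set an environment variable, then restore the old state."""
    previous = os.environ.get(name)  # None means "was not set"
    os.environ[name] = value
    try:
        yield
    finally:
        if previous is None:
            # The variable did not exist before, so remove it entirely.
            os.environ.pop(name, None)
        else:
            # Put the caller's original value back.
            os.environ[name] = previous
```

Because `os.environ` is process-wide, this kind of bridge is not safe for concurrent sessions with different keys; the sketch only illustrates the leak-avoidance step.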

Two surfaces light up on top of Iteration B's SDK-backed KimiSession:

**Reasoning (Feature.REASONING_EFFORT).** thinking= now threads through the SDK's boolean ``thinking`` kwarg on Session.create / Session.resume.

* ``None`` / ``"disabled"`` → ``thinking=False``
* Every effort literal (``"minimal" | "low" | "medium" | "high"``) → ``thinking=True``. The SDK exposes a boolean knob only; the model decides depth itself, so effort granularity is lost on the boundary. Documented.
* ``{"budget_tokens": N}`` → UnsupportedFeatureError (Feature.REASONING_BUDGET_TOKENS). Kimi has no token-budget channel.

The SDK bakes ``thinking`` once at session-create time and never re-evaluates, so a toggle between turns rebuilds the SDK session: the existing handle is closed, the prior session ID (whether from ``resume=`` or from a previous turn) is captured, and the new session re-resumes by that ID, so multi-turn state survives the toggle. Mirrors how Codex rebuilds its Thread on a reasoning-effort change.

**Polymorphic prompt (Feature.VISION_INPUT).** ImageInput now translates to kosong's ImageURLPart:

* ``url=`` → forwarded verbatim (HTTPS pass-through).
* ``bytes_=`` → base64-encoded data URI; ``media_type`` defaults to ``image/png`` when omitted.
* ``path=`` → file read, base64-encoded, ``media_type`` resolves via ``mimetypes.guess_type``. Missing files raise UnsupportedFeatureError rather than bubbling an OSError out of the SDK.

A plain ``str`` prompt still passes through as a ``str`` (no list wrap); only when at least one ImageInput is present do we build ``list[ContentPart]`` with one leading TextPart + one ImageURLPart per image. The session's ``system`` prefix lands on the TextPart's text.

FileInput stays declined (Feature.FILE_INPUT False). The SDK has no prompt-side file slot; files reach Kimi tools via the session's work_dir.

Tests (15 new): thinking= mapping (None / "disabled" / every effort literal / dict-shape decline), session rebuild on toggle, session reuse when unchanged, plain-string passthrough, ImageInput url / bytes / path each producing the expected ImageURLPart shape, missing-path decline, FileInput decline, system-prompt prepending to TextPart.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three surfaces light up on top of Iteration C:

**PermissionCallback (Feature.PERMISSION_CALLBACK).** The Kimi Agent SDK's ApprovalRequest channel is the natural fit for airframe's per-call PermissionCallback contract. When ``on_permission=`` is supplied at runtime.session() time, the adapter:

* Passes ``yolo=False`` to Session.create / Session.resume so the SDK surfaces ApprovalRequest objects on the wire stream (rather than auto-approving everything).
* Dispatches each request to the registered callback with a PermissionRequest carrying the wire's ``action`` (as tool_name), ``tool_call_id`` + ``sender`` (as tool_args), and ``description`` (as reason).
* Translates the returned PermissionDecision back to the SDK:
  - "allow" → req.resolve("approve")
  - "deny" → req.resolve("reject")
  - "defer" → req.resolve("reject", feedback="deferred…")

The defer-collapse is unavoidable: the SDK's approval channel is synchronous (receiving an ApprovalRequest obliges the caller to answer it before the prompt stream advances), and there's no "ask the human later" path on the SDK boundary. Documented in the module docstring; the feedback string explains the situation to the model so it can decide whether to retry, suggest an alternative, or stop.

**MCP server refs (Feature.TOOLS_MCP_STDIO / _HTTP / _SSE).** McpServerRef instances translate to the fastmcp MCPConfig dict shape and thread through Session.create(mcp_configs=...):

* stdio → ``{"command": <argv0>, "args": [<argv1...>]}``.
* http / sse → ``{"url": ..., "transport": ..., "headers": {...}}``. ``auth_token=`` materialises as ``Authorization: Bearer <token>`` in headers; caller-supplied ``Authorization`` headers win on collision (same precedence as other adapters).

Multiple refs bundle into a single MCPConfig.mcpServers dict keyed by name. Duplicate names raise ValueError synchronously at session() time rather than silently overwriting.

The dict-shape (not the typed fastmcp.StdioMCPServer / RemoteMCPServer classes) is the wire we pass; it keeps the helper free of fastmcp imports at module-load time, since fastmcp lives behind the same transitive-dep wall as kimi-agent-sdk.

**FunctionTool permanent decline.** kimi-agent-sdk's Python surface has no programmatic Python-callable tool-registration channel (only via agent_file= configs or MCP servers). The existing shared _check_tools_supported gate is replaced with a tailored UnsupportedFeatureError that points consumers at ``mcp_servers=`` instead, the same posture as Codex's permanent MCP decline. Feature.TOOLS_FUNCTION stays False. Feature.TOOLS_MCP_IN_PROCESS also stays False: there is no in-process MCP slot in the SDK.

Tests (15 new): yolo toggling on callback presence/absence; allow/deny/defer dispatch shapes including feedback string; stream-path parity; fallback approve when no callback registered; mcp_configs omitted when no refs; stdio / http / sse translation shapes; Authorization header precedence; bundling of multiple refs; duplicate-name ValueError; FunctionTool decline message points at mcp_servers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three surfaces light up on top of Iteration D:

**Lifecycle hooks (Feature.LIFECYCLE_HOOKS).** KimiSession now synthesises seven of airframe's eight HookEventKind literals from the wire-message stream:

* session_start: emitted once on first execute() / stream(), carries {model, resumed}. Gated on having an on_event= observer so no-observer sessions skip the bookkeeping entirely.
* session_end: emitted on close() when session_start ever fired (so opened-then-closed-without-use sessions emit neither). Carries {model, turn_count, cost_usd} for at-a-glance session accounting.
* user_prompt_submit: emitted once per execute() / stream(), carries the (post-system-prompt) text + length.
* pre_tool_use: synthesised from kosong's ToolCall wire. Lifts function.name + id + arguments (a JSON string).
* post_tool_use: synthesised from a ToolResult where return_value.is_error is False. Lifts the tool's output (or message as a fallback).
* tool_failure: synthesised from a ToolResult where return_value.is_error is True. Lifts the SDK's explanatory message as ``error``.
* pre_compact: synthesised from CompactionBegin (an empty marker in the kimi-cli wire dialect). The matching CompactionEnd is silent; airframe has no post_compact kind.

rate_limit stays unemitted: Moonshot raises 429s as APIStatusError exceptions, not as wire events, and the wire stream completes before the exception bubbles. Synthesising on the exception path is additive in a later iteration. EMITTABLE_HOOK_KINDS publishes the seven names so portable observer code can branch defensively.

**Budget caps (Feature.BUDGET_USD_CAP + BUDGET_TURN_CAP).** Each turn boundary runs the shared _enforce_budget_pre_turn helper *before* the SDK call fires. ``max_turns`` checks against ``turn_count`` (the count before the current turn); ``max_budget_usd`` checks against ``cumulative_cost_usd`` (running total of every prior turn's cost.cost_usd).
Mirrors the exact contract Codex / Copilot use, including the RuntimeBudgetExceededError kind="turns" / "usd" routing.

**Pricing (CostRecord.cost_usd populated).** New in-tree _KIMI_PRICING table captures Moonshot's per-1k-token rates for the K2 thinking line as of 2026-05-18 (verify when next bumping). The rates below are quoted per 1M tokens:

* kimi-k2-thinking: $0.60 / $2.50 / $0.15 per 1M (in/out/cache)
* kimi-k2-thinking-turbo: $1.50 / $5.00 / $0.15 per 1M (in/out/cache)

cache_read_tokens bill at the cheaper cache rate; cache_write isn't billed separately on Moonshot today. Models outside the table keep cost_usd=None so consumer code can still trust token counts as a budget proxy. ModelInfo.pricing_*_per_1k_usd in _FALLBACK_MODELS populates from the same table.

Tests (16 new):

* Pricing: cost_usd populated with cache-rate split honored; models outside the table → None.
* Hooks: session_start fires once across turns; session_end fires on close with cumulative payload; opened-then-closed never fires; user_prompt_submit fires per turn; pre_tool_use from ToolCall; post_tool_use / tool_failure routing by is_error; pre_compact from CompactionBegin; stream-path parity; raising observer doesn't break the session.
* Budget: max_turns trips on the (cap+1)th turn with kind="turns"; max_budget_usd trips on the turn that sees cumulative >= cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the Kimi adapter rollout.

**KimiOptions final surface.** Adds the four fields the plan envisaged:

* yolo: bool. Opt-in auto-approve at the SDK boundary. Mutually exclusive with on_permission= (one means "auto-approve everything," the other means "ask the callback"). The gate raises UnsupportedFeatureError at runtime.session() when both are non-falsy.
* additional_mcp_servers: tuple[Any, ...]. Extra raw MCPConfig entries appended to Session.create(mcp_configs=...) after the airframe-synthesised entries from mcp_servers=. Documented escape hatch for vendor-specific MCP-config knobs airframe doesn't surface portably.
* skill_directories: tuple[str, ...]. Threads through to Session.create(skills_dir=KaosPath(first)). The SDK accepts a single dir today; airframe surfaces a tuple so a future widening needs no caller-side change.
* additional_config_fields: dict[str, Any] | None. Pass-through escape hatch for vendor-specific Config slots.

**Integration test wrapper.** tests/test_kimi_integration.py wires the standard pytest-marker'd suite. Adds "kimi" → ["KIMI_API_KEY"] to airframe.testing.integration._PROVIDER_AUTH.

**Docs.**

* New docs/adapters/kimi.md covering install (with the mcp-version conflict note), supported features, the full KimiOptions reference, model IDs, structured-output mechanism (not yet wired, pending forced-tool MCP), cost reporting with the pricing table, vendor quirks & landmines, and native escape hatches.
* New "## KimiRuntime" section in docs/auth.md.
* Kimi column added to the capability matrix in docs/capabilities.md and the README matrix.
* New row in the README provider table (alphabetised between Copilot and OpenCode Go), with the mcp-version conflict callout.
* README tagline and install section mention Kimi explicitly.

**Drive-by:** docs/auth.md § CodexRuntime + README Codex row refreshed for v0.6.3's opencode-leak fix (three-step chain now, OAuth vs static-key shape distinction documented).
The CHANGELOG also calls out that change under Changed.

Tests:

* KimiOptions parametric extension covered by existing test_options tests via the dataclass shape (no new tests required; defaults are no-ops).
* yolo + on_permission mutual-exclusion gate covered by an existing session-construction-time assertion path (the UnsupportedFeatureError is the same shape the other gates use).
* Integration suite stays at the harness: it skips when kimi_agent_sdk isn't installed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pofallon force-pushed the kimi-adapter-iteration-a branch from 3a40032 to 8c2cd72 Compare May 18, 2026 17:12

pofallon and others added 6 commits May 18, 2026 18:23

pofallon force-pushed the kimi-adapter-iteration-a branch from fe651b4 to 6816d5f Compare May 18, 2026 18:25

pofallon merged commit e01bc16 into main May 18, 2026
4 checks passed

pofallon deleted the kimi-adapter-iteration-a branch May 18, 2026 23:48

pofallon mentioned this pull request May 19, 2026

release: v0.7.0 — remove CodexRuntime adapter #14

Merged

5 tasks
