
feat: Exposing cache control API in OpenAI and Anthropic API's #486

Open
wookievx wants to merge 1 commit into softwaremill:master from wookievx:cache-keys-and-retention-support


@wookievx

Contributor

Adds support for the caching-related request parameters of OpenAI chat completions and for the cache_control parameters of Anthropic's create-message API.

wookievx force-pushed the cache-keys-and-retention-support branch from 3554b2a to 5bf3da4 (June 24, 2026 13:45)
wookievx force-pushed the cache-keys-and-retention-support branch from 5bf3da4 to 4998f71 (June 24, 2026 13:52)
@adamw

adamw commented Jun 25, 2026

Member

Review: Exposing cache control API in OpenAI and Anthropic APIs

Overall: Mostly correct. I verified the wire shapes against the Anthropic and OpenAI docs: the request-level cache_control (Anthropic supports a top-level auto-placement param), the content-block cache_control placements, the {"type":"ephemeral","ttl":...} JSON shape, and the OpenAI "24h" / "in_memory" values are all right. Notably "in_memory" (underscore) matches what the OpenAI API actually expects — see openai-python#2883, where the SDK type wrongly used a hyphen. The one real bug is tool-level cache-control placement.

Findings (most-severe first)

  1. claude/src/main/scala/sttp/ai/claude/models/Tool.scala: tool cache_control is nested inside input_schema instead of placed on the tool object, so tool prompt-caching silently never takes effect. cacheControl was added to ToolInputSchema, and Tool.customRW serializes it under input_schema: {"name":...,"description":...,"input_schema":{"type":"object",...,"cache_control":{"type":"ephemeral"}}}. The Anthropic API expects cache_control as a sibling of input_schema on the tool object, not inside the JSON schema. A user who sets it to create a tool cache breakpoint gets no caching (the nested key is ignored as an unknown schema field). The round-trip unit test passes because uPickle reads back whatever it writes; it doesn't validate against the real API shape.

  2. README.md (Advanced Parameters example, ~line 290): the example no longer compiles because a comma is missing before the new cacheControl line. The diff inserts cacheControl = Some(CacheControl.Ephemeral()) directly after tools = Some(tools) with no trailing comma on the prior line, producing two named arguments with no separator. (CacheControl is also unimported in that snippet.) It's a plain scala code fence, not mdoc:compile-only, so CI won't catch it, but anyone copying the example hits a syntax error.

  3. claude/src/main/scala/sttp/ai/claude/models/ContentBlock.scala: WebSearchResult.cacheControl is dead API surface. WebSearchResult is a response-only sub-result (returned inside WebSearchToolResultContent), never sent in a request, so a cache_control field on it can never do anything. Minor, but it widens the public API with a meaningless option and invites confusion about where caching is actually configurable.

  4. claude/src/main/scala/sttp/ai/claude/models/Usage.scala: totalTokens semantics changed. It now returns inputTokens + cacheReadInputTokens + cacheCreationInputTokens + outputTokens rather than inputTokens + outputTokens. Reasonable (and totalInputTokens is a useful addition), but it's a behavioral change to an existing public method: any caller relying on the old value will see different numbers once caching is in play. Worth a changelog note.
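To make finding 1 concrete, here is a minimal sketch of the two placements and of a wire-shape check that a serialization test could use instead of a pure round trip. It is not the PR's code: it builds the JSON by hand with ujson (assumed to be available as the module that ships with uPickle), and the tool name, description, schema, and the helper hasToolLevelCacheControl are all hypothetical.

```scala
//> using dep com.lihaoyi::ujson:3.1.0

object ToolCacheControlSketch {
  // Illustrative schema only; built fresh on each call because ujson values are mutable.
  def inputSchema(): ujson.Obj = ujson.Obj(
    "type" -> "object",
    "properties" -> ujson.Obj("city" -> ujson.Obj("type" -> "string"))
  )

  def ephemeral(): ujson.Obj = ujson.Obj("type" -> "ephemeral")

  // Shape finding 1 reports: cache_control ends up inside input_schema.
  def nestedPlacement(): ujson.Obj = {
    val schema = inputSchema()
    schema("cache_control") = ephemeral()
    ujson.Obj(
      "name" -> "get_weather",
      "description" -> "Look up the weather for a city",
      "input_schema" -> schema
    )
  }

  // Shape finding 1 says the API expects: cache_control is a sibling of input_schema.
  def siblingPlacement(): ujson.Obj = ujson.Obj(
    "name" -> "get_weather",
    "description" -> "Look up the weather for a city",
    "input_schema" -> inputSchema(),
    "cache_control" -> ephemeral()
  )

  // Hypothetical test helper: true only when cache_control sits on the tool object itself.
  def hasToolLevelCacheControl(tool: ujson.Value): Boolean =
    tool.obj.contains("cache_control") &&
      !tool("input_schema").obj.contains("cache_control")

  def main(args: Array[String]): Unit = {
    println(ujson.write(siblingPlacement(), indent = 2))
    println(hasToolLevelCacheControl(nestedPlacement()))  // false
    println(hasToolLevelCacheControl(siblingPlacement())) // true
  }
}
```

An assertion of this kind on the serialized JSON would fail for the nested shape even though a write-then-read round trip succeeds.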

Findings 1 and 2 are worth fixing before merge; 3 and 4 are judgment calls for the author.
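The size of the change in finding 4 is easy to see with a small worked example. The sketch below uses a hypothetical stand-in for the library's Usage model: the field names follow the review, while the Option types, the default values, the definition of totalInputTokens as the sum of the three input counts, and the legacyTotalTokens helper are assumptions made for illustration.

```scala
object UsageTotalsSketch {
  // Hypothetical stand-in for the Usage model; not the library's actual definition.
  final case class Usage(
      inputTokens: Int,
      outputTokens: Int,
      cacheCreationInputTokens: Option[Int] = None,
      cacheReadInputTokens: Option[Int] = None
  ) {
    // Assumed definition: uncached input plus cache reads plus cache writes.
    def totalInputTokens: Int =
      inputTokens + cacheReadInputTokens.getOrElse(0) + cacheCreationInputTokens.getOrElse(0)

    // New semantics described in finding 4.
    def totalTokens: Int = totalInputTokens + outputTokens

    // Old semantics, kept here only for comparison.
    def legacyTotalTokens: Int = inputTokens + outputTokens
  }

  def main(args: Array[String]): Unit = {
    val cached = Usage(
      inputTokens = 100,
      outputTokens = 50,
      cacheCreationInputTokens = Some(2000),
      cacheReadInputTokens = Some(300)
    )
    println(cached.legacyTotalTokens) // 150
    println(cached.totalTokens)       // 2450

    // Without caching the two definitions agree.
    val uncached = Usage(inputTokens = 100, outputTokens = 50)
    println(uncached.totalTokens == uncached.legacyTotalTokens) // true
  }
}
```

Callers that never use caching see no difference, which is why the change is easy to miss without a changelog entry.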

🤖 Review generated with Claude Code
