feat: add requestTransform for deterministic matching and recording by iskhakovt · Pull Request #63 · CopilotKit/aimock

iskhakovt · 2026-03-30T17:39:43Z

Problem

LLM prompts often contain dynamic data (timestamps, UUIDs, session IDs)
injected by upstream services. When recording fixtures, the match key
includes this dynamic data. On replay, the same prompt with different
timestamps doesn't match the stored fixture.

Example: Hindsight memory server injects Event Date: 2026-03-30T13:19:38
into extraction prompts. Each test run has a different timestamp, so
recorded fixtures never match on replay.

Solution

Add requestTransform to MockServerOptions — a function that normalizes
requests before both matching and recording:

const mock = new LLMock({
  requestTransform: (req) => ({
    ...req,
    messages: req.messages.map(m => ({
      ...m,
      content: typeof m.content === "string"
        ? m.content.replace(/\d{4}-\d{2}-\d{2}T[\d:.+Z]+/g, "")
        : m.content,
    })),
    embeddingInput: req.embeddingInput?.split(" | ")[0],
  }),
});

Recording: saves the transformed match key — no timestamps in fixture.
Matching: transforms incoming request before comparison — same clean key.
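
To illustrate why recording and matching end up with the same key, here is a minimal sketch that applies the timestamp regex from the example above to two prompts differing only in their Event Date. The helper name stripTimestamps and the sample prompts are assumptions made for illustration; they are not part of the library.

```typescript
// Hypothetical helper: uses the same regex as the requestTransform example above.
function stripTimestamps(text: string): string {
  return text.replace(/\d{4}-\d{2}-\d{2}T[\d:.+Z]+/g, "");
}

// Sample prompts (made up) that differ only in the injected timestamp.
const recordedPrompt = "Event Date: 2026-03-30T13:19:38 Extract facts.";
const replayedPrompt = "Event Date: 2026-04-02T08:00:01 Extract facts.";

const recordedKey = stripTimestamps(recordedPrompt);
const replayedKey = stripTimestamps(replayedPrompt);

// The raw prompts differ, but the normalized keys are identical.
console.log(recordedPrompt === replayedPrompt); // false
console.log(recordedKey === replayedKey); // true
```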

When requestTransform is set, string matching switches from includes
(substring) to === (exact equality). This prevents false positive matches
from shortened keys accidentally matching unrelated prompts. Without a
transform, existing includes behavior is preserved (backward compatible).
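
The difference between the two comparison modes can be sketched as follows. matchesString is a hypothetical stand-in for the router's string comparison, and the fixture key and prompt are invented sample values; the real logic lives in router.ts and may differ in detail.

```typescript
// Hypothetical stand-in for the router's string comparison.
// hasTransform mirrors "requestTransform is set" in the description above.
function matchesString(fixtureKey: string, incoming: string, hasTransform: boolean): boolean {
  return hasTransform ? incoming === fixtureKey : incoming.includes(fixtureKey);
}

// A short key, such as one left over after normalization (sample data).
const shortKey = "Summarize";
const unrelatedPrompt = "Summarize this unrelated contract";

// Substring matching accepts the unrelated prompt (a false positive).
const substringHit = matchesString(shortKey, unrelatedPrompt, false);
// Exact matching rejects it and only accepts an identical string.
const exactHit = matchesString(shortKey, unrelatedPrompt, true);
const exactSame = matchesString(shortKey, "Summarize", true);

console.log(substringHit, exactHit, exactSame); // true false true
```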

Follows the Polly.js pattern of
composable request normalizers for deterministic snapshot matching.

Changes

types.ts: Add requestTransform to MockServerOptions and HandlerDefaults
router.ts: Optional 4th param on matchFixture, exact match when transform set
server.ts: Thread transform into defaults
recorder.ts: Apply transform before match key extraction
All handlers: Pass defaults.requestTransform as 4th arg to matchFixture
docs/record-replay.html: Document requestTransform feature
8 new router tests for transform behavior, exact matching, backward compat

chore: release 0.1.0

Add pre-commit hook and CLAUDE.md

Make husky prepare script graceful for fresh installs

Signed-off-by: Tyler Slaton <tyler@copilotkit.ai>

chore: release 1.0.0

docs: update CLAUDE.md for conventional commits

Add unit tests badge to README

docs: add CopilotKit kite favicon

Add handler modules for two new LLM provider APIs, both following the established pattern from responses.ts: convert inbound request to ChatCompletionRequest, match fixtures, convert response back to provider-specific format.

Claude Messages API (/v1/messages):
- Streaming via event: type / data: json SSE format
- Non-streaming JSON responses
- Full message lifecycle: message_start through message_stop
- Tool use with input_json_delta streaming
- msg_ and toolu_ ID prefixes

Google Gemini GenerateContent API:
- /v1beta/models/{model}:generateContent (non-streaming)
- /v1beta/models/{model}:streamGenerateContent (streaming)
- data-only SSE format (no event prefix, no [DONE])
- functionCall/functionResponse round-trips with synthetic IDs
- FUNCTION_CALL finishReason for tool call responses

Also adds generateMessageId() and generateToolUseId() helpers, server routes for both providers, and comprehensive tests.

Rename the project from @copilotkit/mock-openai to @copilotkit/llmock to reflect multi-provider scope (OpenAI, Anthropic, Google Gemini).

- Class: MockOpenAI → LLMock
- Files: mock-openai.ts → llmock.ts, mock-openai.test.ts → llmock.test.ts
- Package: @copilotkit/mock-openai → @copilotkit/llmock
- CLI: "Usage: mock-openai" → "Usage: llmock"
- Binary: mock-openai → llmock
- All imports, tests, and docs updated
- Clean break: no backward-compat alias

…lmock Add Claude + Gemini provider support, rename to LLMock

Update README.md and docs/index.html to reflect the rename from mock-openai/MockOpenAI to llmock/LLMock throughout. Add documentation for Claude Messages API and Gemini GenerateContent endpoints, update the MSW comparison table with multi-provider rows, and add ANTHROPIC_BASE_URL/Gemini base URL examples.

Add src/__tests__/api-conformance.test.ts with 52 tests validating that mock server output structurally matches each real API spec: OpenAI Chat Completions, OpenAI Responses API, Anthropic Claude Messages API, Google Gemini, and cross-provider invariants. Tests cover required fields, types, value enums, event sequences, headers, and ID prefix formats.

…llmock Rename docs to llmock + add multi-provider docs and API conformance tests

getTextContent now supports ContentPart[] content (e.g. [{type:"text", text:"..."}]) as sent by some SDKs like Strands. Empty-string text parts are filtered out, returning null instead of "".

…upport Add getTextContent for array-format message content

Tests hitting real LLM APIs cost money, time out, and are flaky. The old copy focused on multi-process architecture; the new copy leads with what users actually care about.

Rewrite 'Why llmock' to lead with the problem

ci: add workflow_dispatch trigger to release workflow

prependFixture() inserts a fixture at the front of the list (index 0), replacing the pattern of addFixture() + splice/unshift via `as any`. getFixtures() returns a readonly view of the fixture array, replacing direct access to the private `fixtures` field via `as any`. Both methods are needed by ag-ui's e2e test setup to prepend a tool-result catch-all fixture and log fixture statistics.

…-fixtures Add prependFixture() and getFixtures() public API

…end-get-fixtures Add changeset for 1.1.0 release (prependFixture/getFixtures)

- metrics.test.ts: add test that injects a faulty registry via spy to verify the try-catch in res.on("finish") prevents process crashes; rename existing test for accuracy
- stream-collapse.test.ts: update CRC mismatch tests to assert result.truncated === true (replaced console.warn spy pattern)

…ion test Clamp x-llmock-chaos-* header values to [0,1] and warn on NaN or out-of-range input. Restore universal clamping in resolveChaosConfig to cover fixture-level and server-default rates (regression from prior change). Fix file-level docstring to accurately describe the three chaos actions. Add tests for header clamping/NaN behavior and disconnect chaos action end-to-end.

…nish callback Wrap the res.on('finish') metrics block in try/catch to prevent instrumentation errors (wrong label cardinality, registry misconfiguration) from propagating silently or crashing the request handler. Log failures at warn level so operators see them without enabling debug logging.

Change providerKey parameter type from string to RecordProviderKey in collapseStreamingResponse, proxyAndRecord, handleGemini, and handleCompletions. Catches provider key typos at compile time. Add console.warn for unknown SSE provider fallback and document the OpenAI fallback behavior in the docstring. Add TODO comments for CollapseResult discriminated union and chunkSize helper centralization. Fix test comment and cast for unknown-provider fallback path.

…d time Add error-severity validation checks in validateFixtures for streamingProfile (ttft >= 0, tps > 0, jitter in [0,1]) and chaos (all rates in [0,1]). Catches nonsensical streaming physics and out-of-range chaos rates early with clear error messages rather than silently producing broken behavior at request time.

…G chaos flags

- docker.html: fix health probes (TCP socket → httpGet on /health and /ready)
- docker.html: remove "CLI Configuration (v1.7.0)" section (references non-existent --config flag and aimock binary name)
- docker.html: fix --chaos-error-rate → --chaos-drop/--chaos-malformed/--chaos-disconnect
- docker.html: fix mountPath /fixtures → /app/fixtures (matches actual values.yaml)
- docs.html: add POST /v2/chat (Cohere) and POST /api/generate (Ollama) to endpoint table
- CHANGELOG.md: fix "via --chaos CLI flag" → list all three chaos flags
- README.md: fix chaos-testing link (chaos.html → chaos-testing.html)

… bedrock SSE; body timeout

- chaos.ts: add optional logger param to resolveChaosConfig/evaluateChaos/applyChaos; replace all console.warn calls with logger?.warn
- stream-collapse.ts: logger param on collapseStreamingResponse; replace console.warn; add explicit case "bedrock" routing to collapseAnthropicSSE; add bounds check in decodeEventStreamFrames: return {frames, truncated:true} when totalLength extends past buffer, preventing out-of-bounds reads on malformed/truncated EventStream frames
- recorder.ts: pass defaults.logger to collapseStreamingResponse; add res.setTimeout body accumulation timeout (30s) to prevent unbounded memory growth on slow responses
- bedrock.ts: update module docstring to describe all four endpoint families
- all handlers: pass defaults.logger as final arg to all applyChaos call sites

…edrock SSE, and body timeout

- chaos.test.ts: verify evaluateChaos without logger does not call console.warn; verify invalid chaos header with logLevel:silent is silently ignored end-to-end
- stream-collapse.test.ts: verify bounds check returns {truncated:true} for oversized totalLength; verify provider="bedrock" routes to collapseAnthropicSSE
- recorder.test.ts: verify proxyAndRecord calls res.setTimeout(30_000) on upstream IncomingMessage

…ation, type unions

- recorder.ts: fix misleading 'saving raw response' log → 'saving as error fixture'
- recorder.ts: warn when stream collapse produces empty content
- recorder.ts: preserve both empty-match and truncation warnings in fixture JSON
- cli.ts: exit(1) on zero fixtures in strict/validate mode
- server.ts: warn on out-of-range chaos config values at startup
- bedrock.ts/messages.ts: narrow content block type from string to union
- aws-event-stream.ts: fix writeEventStream docstring return semantics

…3004)

…Kit#53)

## Summary

Major feature release adding 8 capabilities to llmock, plus 29 bugs found and fixed in code review.

### Provider Endpoints
- **Bedrock Streaming**: invoke-with-response-stream (AWS Event Stream binary) + Converse API
- **Vertex AI**: Routes to existing Gemini handler
- **Ollama**: /api/chat, /api/generate, /api/tags (NDJSON streaming)
- **Cohere**: /v2/chat (typed SSE events)

### Infrastructure
- **Chaos Testing**: Probabilistic drop/malformed/disconnect, three precedence levels (header > fixture > server), rate clamping to [0,1]
- **Prometheus Metrics**: Opt-in /metrics, counters, cumulative histograms, gauges

### Record-and-Replay
- **Proxy-on-miss**: Real API responses saved as fixtures with 30s upstream timeout
- **Stream collapsing**: 6 functions (SSE, NDJSON, EventStream) supporting both Converse and Messages formats
- **Strict mode (503)**: Catch missing fixtures in CI
- **Auth safety**: Forwarded but redacted in journal, never in fixtures

### Quality
- **1250 tests** across 37 files
- 7 rounds of 7-agent code review, 29 bugs found and fixed
- Build/format/lint clean, zero external dependencies, zero as-any in source

## Review Fixes (29 total across 7 rounds)

### Round 1: Original review (20 findings)
- HandlerDefaults type extracted, fixing silent undefined access in 5 handlers
- Provider-specific error formats (Anthropic, Gemini, Bedrock)
- Recorder binary relay corruption (UTF-8 round-trip on EventStream)
- collapseOllamaNDJSON tool_calls + buildFixtureResponse priority
- ChaosAction dedup, RecordProviderKey union, OllamaMessage.role union
- collapseCohereSSE naming, chaos rate clamping, recorder auth comment
- SKILL.md 503 status, warn log level, README provider list, types.ts header

### Round 2 (2 findings)
- applyChaos registry argument missing in 5 handlers (chaos metrics incomplete)
- Bedrock Converse response format missing in buildFixtureResponse

### Round 5: fresh context (2 findings)
- Global recordCounter → crypto.randomUUID() (concurrent test determinism)
- rawBody pass-through in OpenAI completions proxy path

### Round 6: fresh context (2 findings)
- 30s upstream timeout in makeUpstreamRequest (prevents indefinite hangs)
- collapseBedrockEventStream: handle both Converse (camelCase) and Messages (flat type) formats

### Round 7: fresh context (3 findings)
- new URL() validation with specific 502 error for malformed provider URLs
- writtenToDisk flag to prevent misleading "Response recorded" log on write failure
- res.on("error") handler for upstream response stream mid-transfer drops

All fixes have corresponding regression tests.

Automated weekly update based on competitor README analysis.

Writes a markdown file with a change table and mermaid flowchart grouped by competitor. Mermaid node labels are quoted and subgraph IDs sanitized to handle special characters in competitor/capability names.

Covers markdown table generation, mermaid flowchart structure, special character escaping (parentheses, quotes, slashes), competitor grouping, node ID uniqueness, and file I/O.

Pass --summary to the script and use gh pr create --body-file to inject the markdown directly, avoiding shell interpolation of backticks from the mermaid code fences.

## Summary
- The `update-competitive-matrix.ts` script now accepts `--summary <path>` to write a markdown summary with a change table and mermaid flowchart grouped by competitor
- The workflow uses `--body-file` to inject the summary directly into the PR body, avoiding shell interpolation of mermaid backtick fences
- Mermaid node labels are quoted and subgraph IDs sanitized to handle special characters (parentheses, slashes, quotes)
- Unit tests cover formatting, mermaid structure, escaping, and edge cases

## Test plan
- [x] `pnpm test`: 1318 tests pass (13 new)
- [x] `pnpm run format:check`: clean
- [x] `pnpm run lint`: clean
- [x] `pnpm run build`: clean
- [ ] Trigger workflow via `workflow_dispatch` and confirm PR body renders correctly

…ording

Optional requestTransform on MockServerOptions normalizes requests before fixture matching. When set, string comparisons use exact equality (===) instead of includes() for deterministic recorded-fixture replay.

- matchFixture gets optional 4th parameter, threaded from all handlers
- Recorder applies transform before building fixture match keys
- 8 new tests cover transform behavior, backward compat, and predicate passthrough

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

claude

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

pkg-pr-new · 2026-03-31T22:23:12Z

npm i https://pkg.pr.new/CopilotKit/llmock/@copilotkit/llmock@63

commit: 75d0dfa

jpr5

Code Review — 7-Agent Standard CR

335 lines across 17 files reviewed by 7 specialized agents.

The feature design is sound and well-motivated — requestTransform solves a real problem with dynamic data in recorded fixtures. The implementation is mechanically correct across all 17 files, with consistent threading of the new parameter through every handler. Three items need addressing before merge.

Bugs

1. Docs claim "RegExp and predicate matching are unaffected" — RegExp IS affected

docs/record-replay.html states:

RegExp and predicate matching are unaffected

But the code in router.ts applies effectiveReq (the transformed request) to all matching criteria including regex — only predicates receive the original req. Your own test at line 125 ("regexp does not match when transform changes the text") proves this by asserting that regex fails when the transform changes the content.

Fix: Change to something like:

Only predicate matching is unaffected — predicates always receive the original (untransformed) request. All other match criteria (including RegExp) operate on the transformed request.

Also update the JSDoc on requestTransform in types.ts — it mentions matching but omits the effect on recording and the includes() → === behavioral switch.
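
The behavior described in this item can be sketched with a simplified matcher. The types and the matches function below are hypothetical reductions of the router logic, written only to show that RegExp criteria see the transformed request while predicates see the original; they are not the actual router.ts code.

```typescript
// Simplified, hypothetical types standing in for the library's request and match shapes.
type Message = { role: string; content: string };
type Req = { messages: Message[] };
type Match = { userMessage?: string | RegExp; predicate?: (req: Req) => boolean };

function lastUserText(req: Req): string | undefined {
  for (let i = req.messages.length - 1; i >= 0; i--) {
    if (req.messages[i].role === "user") return req.messages[i].content;
  }
  return undefined;
}

function matches(match: Match, req: Req, transform?: (r: Req) => Req): boolean {
  const effectiveReq = transform ? transform(req) : req;
  // Predicates always receive the original (untransformed) request.
  if (match.predicate && !match.predicate(req)) return false;
  if (match.userMessage !== undefined) {
    // String and RegExp criteria operate on the transformed request.
    const text = lastUserText(effectiveReq);
    if (text === undefined) return false;
    if (match.userMessage instanceof RegExp) return match.userMessage.test(text);
    return transform ? text === match.userMessage : text.includes(match.userMessage);
  }
  return true;
}

const req: Req = { messages: [{ role: "user", content: "Event Date: 2026-03-30T13:19:38" }] };
const stripDates = (r: Req): Req => ({
  messages: r.messages.map((m) => ({
    ...m,
    content: m.content.replace(/\d{4}-\d{2}-\d{2}T[\d:.+Z]+/g, ""),
  })),
});

const regexWithoutTransform = matches({ userMessage: /2026-03-30/ }, req);
const regexWithTransform = matches({ userMessage: /2026-03-30/ }, req, stripDates);
const predicateWithTransform = matches(
  { predicate: (r) => r.messages[0].content.includes("2026-03-30") },
  req,
  stripDates,
);

console.log(regexWithoutTransform, regexWithTransform, predicateWithTransform); // true false true
```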

2. Transform that throws crashes the handler with opaque "Internal error"

requestTransform is user-supplied code called unprotected in matchFixture():

const effectiveReq = requestTransform ? requestTransform(req) : req;

If the transform throws (TypeError, user logic error, etc.), the exception propagates up through every handler into the HTTP response path. Users get a generic 500 "Internal error" with zero indication their transform was the problem.

In recorder.ts, it's worse — the upstream response has already been fetched but is lost when the transform throws during fixture-key building.

In WebSocket handlers, the matchFixture call is outside the JSON parse try/catch, so a throwing transform crashes the message handler.

Fix: Wrap the transform invocation in try/catch. Log the actual error with context identifying the transform as the source. Return null from matchFixture (no match) on failure, or create a small wrapper:

let effectiveReq: ChatCompletionRequest;
try {
  effectiveReq = requestTransform ? requestTransform(req) : req;
} catch (err) {
  // need logger access — either pass it in or wrap at call sites
  return null;
}

Since matchFixture doesn't have logger access, the wrapping may need to happen at call sites. Either approach works.
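
One way the call-site wrapping could look is sketched below. applyTransformSafely, the Logger shape, and the simplified Req type are all hypothetical names introduced for illustration; the real handlers would use the project's own logger and ChatCompletionRequest type.

```typescript
// Simplified, hypothetical shapes; the real code uses ChatCompletionRequest and the server logger.
type Req = { model: string; messages: { role: string; content: string }[] };
type Logger = { warn: (message: string) => void };

// Runs the user-supplied transform and returns null (treated as "no match") if it throws.
function applyTransformSafely(
  req: Req,
  transform: ((r: Req) => Req) | undefined,
  logger?: Logger,
): Req | null {
  if (!transform) return req;
  try {
    return transform(req);
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    // Name the transform as the source so the failure is not an opaque 500.
    logger?.warn(`requestTransform threw, treating request as unmatched: ${detail}`);
    return null;
  }
}

const warnings: string[] = [];
const logger: Logger = { warn: (message) => warnings.push(message) };
const sample: Req = { model: "example-model", messages: [{ role: "user", content: "hi" }] };

const failed = applyTransformSafely(sample, () => {
  throw new TypeError("boom");
}, logger);
const passed = applyTransformSafely(sample, (r) => ({ ...r, model: "normalized" }), logger);
```

Whether a failing transform should yield "no match" or a distinct error response is a design choice this sketch does not settle; it only shows the error being caught and attributed.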

3. Transform that drops `messages` causes TypeError

If a user's transform returns an object without messages (e.g. { model: req.model }), getLastMessageByRole(effectiveReq.messages, "user") throws TypeError: Cannot read properties of undefined (reading 'length'). TypeScript interfaces provide no runtime enforcement.

Fix: Either validate the transform output at the top of matchFixture, or add a guard before getLastMessageByRole:

if (match.userMessage !== undefined) {
  if (!effectiveReq.messages?.length) continue;
  const msg = getLastMessageByRole(effectiveReq.messages, "user");
  // ...
}
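
The other option, validating the transform output up front, might be sketched like this. hasMessagesArray is a hypothetical runtime guard, not an existing helper in the codebase, and the sample request is invented.

```typescript
// Hypothetical runtime guard: TypeScript interfaces are erased at runtime,
// so the shape of the transform's return value has to be checked explicitly.
function hasMessagesArray(value: unknown): value is { messages: unknown[] } {
  if (typeof value !== "object" || value === null) return false;
  return Array.isArray((value as { messages?: unknown }).messages);
}

const original = { model: "example-model", messages: [{ role: "user", content: "hi" }] };

// A transform that drops messages, as in the failure case described above.
const droppingTransform = (req: { model: string }) => ({ model: req.model });
// A transform that preserves the request shape.
const preservingTransform = (req: typeof original) => ({ ...req });

const droppedIsValid = hasMessagesArray(droppingTransform(original));
const preservedIsValid = hasMessagesArray(preservingTransform(original));

console.log(droppedIsValid, preservedIsValid); // false true
```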

Missing Test

4. No test for throwing transform

User-supplied code in a hot path with no error handling deserves a test that documents the contract — whether matchFixture propagates the exception or handles it gracefully. Either behavior is fine, but it should be tested and intentional, not accidental.

Design Discussion (non-blocking)

5. Identity transform silently changes matching semantics

The mere presence of ANY transform (even (r) => r) switches string matching from includes() to ===. A user adding a no-op transform expecting no behavioral change will find previously-matching fixtures stop matching. Worth considering whether exact-match should be a separate exactMatch?: boolean option rather than coupled to transform presence.
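
A rough sketch of the decoupled design suggested here: exactMatch is the option name proposed in this comment, while MatchOptions and matchesKey are hypothetical names, and the transform is reduced to a string-to-string function to keep the example short.

```typescript
// Hypothetical options shape: exact matching is opted into explicitly
// instead of being implied by the presence of a transform.
type MatchOptions = {
  requestTransform?: (text: string) => string;
  exactMatch?: boolean;
};

function matchesKey(fixtureKey: string, incoming: string, options: MatchOptions = {}): boolean {
  const effective = options.requestTransform ? options.requestTransform(incoming) : incoming;
  return options.exactMatch ? effective === fixtureKey : effective.includes(fixtureKey);
}

const identity = (text: string): string => text;
const prompt = "What is the weather today?";

// A no-op transform alone leaves substring matching in place.
const withIdentityOnly = matchesKey("weather", prompt, { requestTransform: identity });
// Exact matching applies only when requested.
const withExactMatch = matchesKey("weather", prompt, { requestTransform: identity, exactMatch: true });

console.log(withIdentityOnly, withExactMatch); // true false
```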

6. Transform can mutate original request

If a user writes req.messages = req.messages.filter(...) (mutating in place), effectiveReq === req and the "predicates receive original" contract is silently violated. The docs example uses spread correctly, but nothing enforces immutability. Consider documenting this prominently or using structuredClone(req) before passing to the transform.
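
The structuredClone approach could be sketched as follows. transformOnCopy and the simplified Req type are hypothetical; structuredClone is the standard global available in current Node.js and browsers, and it only works here because the sample request is plain data.

```typescript
// Simplified, hypothetical request shape.
type Req = { messages: { role: string; content: string }[] };

// Hands the transform a deep copy so in-place mutation cannot leak into the original.
function transformOnCopy(req: Req, transform: (r: Req) => Req): Req {
  return transform(structuredClone(req));
}

// A transform that mutates its argument in place, as in the scenario above.
const mutatingTransform = (r: Req): Req => {
  r.messages = r.messages.filter((m) => m.role !== "system");
  return r;
};

const originalReq: Req = {
  messages: [
    { role: "system", content: "rules" },
    { role: "user", content: "hi" },
  ],
};

const effectiveReq = transformOnCopy(originalReq, mutatingTransform);

console.log(originalReq.messages.length, effectiveReq.messages.length); // 2 1
```

Cloning on every request has a cost, so documenting the immutability expectation instead remains a reasonable alternative.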

7. Type duplication across 8+ locations

The 3 WebSocket handler files inline their own defaults type (6 copies) instead of referencing HandlerDefaults. recorder.ts also has its own inline type. Consider extracting:

export type WebSocketHandlerDefaults = HandlerDefaults & { model: string };
export type RequestTransform = (req: ChatCompletionRequest) => ChatCompletionRequest;

This is a pre-existing pattern the PR inherits, not something it introduced — but since you're touching all these signatures anyway, it's a good time to clean it up.

What's Good

Clean, consistent threading of the new parameter through all 15 matchFixture call sites
Predicate isolation (receiving original req) is a well-thought-out design choice
Dual application in recorder (normalize both at match time and record time) ensures round-trip consistency
8 router tests cover the core behavioral contract well: exact match, regex, embedding, backward compat, predicate isolation
Documentation section with realistic code example

jpr5 and others added 30 commits March 3, 2026 13:43

Merge pull request CopilotKit#2 from CopilotKit/release/0.1.0

ffb2123

chore: release 0.1.0

Merge pull request CopilotKit#3 from CopilotKit/add-pre-commit-hook

cc3906e

Add pre-commit hook and CLAUDE.md

Make husky prepare script graceful for fresh installs

b5380f6

Merge pull request CopilotKit#4 from CopilotKit/fix-husky-prepare

10c0f98

Make husky prepare script graceful for fresh installs

chore: release 1.0.0

8c8bd85

Signed-off-by: Tyler Slaton <tyler@copilotkit.ai>

Merge pull request CopilotKit#5 from CopilotKit/release/1.0.0

1ee4927

chore: release 1.0.0

docs: add unit tests badge to README

b90dfa5

docs: update CLAUDE.md to reflect conventional commit requirement

fb983aa

Merge pull request CopilotKit#7 from CopilotKit/fix-claude-md-commits

7faa0dd

docs: update CLAUDE.md for conventional commits

docs: add CopilotKit kite favicon

131ef6c

Merge pull request CopilotKit#6 from CopilotKit/add-test-badge

c9de7d4

Add unit tests badge to README

Merge pull request CopilotKit#8 from CopilotKit/add-favicon

e530ae6

docs: add CopilotKit kite favicon

Merge pull request CopilotKit#9 from CopilotKit/feat/multi-provider-l…

81ca5c9

…lmock Add Claude + Gemini provider support, rename to LLMock

Merge branch 'main' into feat/multi-provider-llmock

bb0aa94

Merge pull request CopilotKit#10 from CopilotKit/feat/multi-provider-…

c53d59a

…llmock Rename docs to llmock + add multi-provider docs and API conformance tests

fix: handle array-of-parts content in getTextContent and matchFixture

e52cbeb

getTextContent now supports ContentPart[] content (e.g. [{type:"text", text:"..."}]) as sent by some SDKs like Strands. Empty-string text parts are filtered out, returning null instead of "".

Merge pull request CopilotKit#11 from CopilotKit/feat/array-content-s…

c7c6a81

…upport Add getTextContent for array-format message content

ci: add workflow_dispatch trigger to release workflow

dc383f0

docs: rewrite Why llmock section to lead with the problem

5d32b36

Tests hitting real LLM APIs cost money, time out, and are flaky. The old copy focused on multi-process architecture; the new copy leads with what users actually care about.

Merge pull request CopilotKit#13 from CopilotKit/docs/why-llmock-copy

7f3a466

Rewrite 'Why llmock' to lead with the problem

Merge pull request CopilotKit#12 from CopilotKit/ci/workflow-dispatch

31c3b98

ci: add workflow_dispatch trigger to release workflow

docs: document prependFixture() and getFixtures() in README

0f3d6be

Merge pull request CopilotKit#14 from CopilotKit/feat/prepend-and-get…

374e899

…-fixtures Add prependFixture() and getFixtures() public API

chore: add changeset for prependFixture/getFixtures

9948a8b

Merge pull request CopilotKit#15 from CopilotKit/chore/changeset-prep…

8a19ef4

…end-get-fixtures Add changeset for 1.1.0 release (prependFixture/getFixtures)

jpr5 and others added 18 commits March 21, 2026 10:26

test: add unit tests for drift remediation scripts

3657cf1

docs: fix endpoint label (Groq not Azure) and metrics port (4010 not …

c694c9b

…3004)

docs: update competitive matrix from latest competitor data

6ca3f60

Update competitive matrix (CopilotKit#54)

80bea94

Automated weekly update based on competitor README analysis.

feat: add --summary flag to competitive matrix script

97cfb1b

Writes a markdown file with a change table and mermaid flowchart grouped by competitor. Mermaid node labels are quoted and subgraph IDs sanitized to handle special characters in competitor/capability names.

test: add unit tests for competitive matrix summary formatting

e0a2b63

Covers markdown table generation, mermaid flowchart structure, special character escaping (parentheses, quotes, slashes), competitor grouping, node ID uniqueness, and file I/O.

ci: use --body-file for competitive matrix PR body

be8bd34

Pass --summary to the script and use gh pr create --body-file to inject the markdown directly, avoiding shell interpolation of backticks from the mermaid code fences.

iskhakovt force-pushed the feat/request-transform branch from bac282e to 0207739 Compare March 30, 2026 18:52

iskhakovt force-pushed the feat/request-transform branch from 0207739 to 958add3 Compare March 30, 2026 19:43

docs: add requestTransform section to record-replay docs

75d0dfa

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

iskhakovt marked this pull request as ready for review March 30, 2026 20:34

claude bot reviewed Mar 30, 2026

iskhakovt force-pushed the feat/request-transform branch 2 times, most recently from 9694329 to 75d0dfa Compare March 31, 2026 02:24

jpr5 requested changes Mar 31, 2026

jpr5 force-pushed the main branch 2 times, most recently from f97049f to 2bf6bc3 Compare April 3, 2026 19:48

iskhakovt wants to merge 169 commits into CopilotKit:main from iskhakovt:feat/request-transform
