fix(mcp): prevent + clarify empty-args failures on axme_save_memory #155
Merged
Two agent sessions independently hit axme_save_memory failing with
"expected string, received undefined" for every required field — the
args object arrived empty {}. Controlled testing (see
SAVE_MEMORY_ARG_REPORT.md) confirmed it is NOT a server/handler/schema
defect: every call whose args actually reached the server persisted
fine, and save_decision worked in the same session. Root cause is a
client-side generative slip — the agent emits the tool-call shell while
deferring the heavy free-text fields, and the fill never happens. Two
factors make save_memory uniquely prone: heaviest required surface
among the axme tools, and an over-generalized "batch axme calls in
parallel" habit inherited from the read-tool instructions.
The SDK validates against the zod schema BEFORE our handler runs, so an
empty payload never reaches our code — we cannot echo received keys
without loosening the advertised schema (which would worsen the root
cause by hiding the required fields from the model). So the fixes are
all on the instruction/description surface, which is what actually
steers generation, plus better error text:
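
The ordering described above can be sketched without the SDK. This is a hypothetical stand-in, not the SDK's actual dispatch code: `validate`, `handler`, and `dispatch` are illustrative names, and the argument values are made up. It only shows why an empty payload is rejected before any handler code could inspect it.

```typescript
// Hypothetical stand-in for the SDK's dispatch path; not the real SDK code.
type Args = Record<string, unknown>;
type Result = { ok: true; value: string } | { ok: false; errors: string[] };

const REQUIRED = ["type", "title", "description"] as const;

// Stand-in for the schema check performed against the advertised schema.
function validate(args: Args): string[] {
  return REQUIRED.filter((k) => typeof args[k] !== "string").map(
    (k) => `${k}: expected string, received ${typeof args[k]}`
  );
}

let handlerCalls = 0;

// Stand-in for our tool handler; it only ever sees validated args.
function handler(args: Args): string {
  handlerCalls += 1;
  return `saved ${String(args.title)}`;
}

// Validation runs first, so a rejected payload returns before the handler.
function dispatch(args: Args): Result {
  const errors = validate(args);
  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: handler(args) };
}

const empty = dispatch({}); // rejected at the gate
const valid = dispatch({ type: "pattern", title: "t", description: "d" });
```

After both calls `handlerCalls` is 1: the handler ran for the valid payload only, which is why it has no opportunity to echo what an empty call contained.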
A. Custom zod v4 { error } messages on the required fields of
save_memory (type/title/description) and save_decision
(title/decision/reasoning). Instead of a bland "received undefined"
(which the first agent misread as "the server lost my arguments"
and retried 9x before filing a server-bug report), each field now
says it is REQUIRED, must be composed in THIS call, and that an
empty/deferred emission is the usual cause.
B. Hardened tool descriptions: explicit "call standalone, not in a
parallel batch; include all required fields in the same call" plus
a worked example object for save_memory.
C. Clarified server instructions: parallelism is for the READ tools
(oracle/decisions/memories) only; a new SAVE-TOOL RULE states that
save_memory/save_decision/update_safety are called one at a time
with all required fields composed in that same call.
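
The SAVE-TOOL RULE in C is enforced through instruction text only. As an illustration of what the rule asks for, here is a minimal sketch of a hypothetical client-side planner that batches read calls together and puts every save call in its own batch. `planBatches`, the short tool names, and the argument values are assumptions made for the example and are not part of this change.

```typescript
// Hypothetical client-side planner; nothing here exists in the PR.
type ToolCall = { name: string; args: Record<string, unknown> };

const READ_TOOLS = new Set(["oracle", "decisions", "memories"]);

// Required fields as listed in the commit message; update_safety's
// required fields are not stated there, so it has no entry.
const REQUIRED: Record<string, string[]> = {
  save_memory: ["type", "title", "description"],
  save_decision: ["title", "decision", "reasoning"],
};

function planBatches(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  let readBatch: ToolCall[] = [];
  for (const call of calls) {
    if (READ_TOOLS.has(call.name)) {
      readBatch.push(call); // reads may share a parallel batch
      continue;
    }
    if (readBatch.length > 0) {
      batches.push(readBatch);
      readBatch = [];
    }
    // Save tools: refuse a shell whose required fields were deferred.
    const missing = (REQUIRED[call.name] ?? []).filter(
      (k) => typeof call.args[k] !== "string"
    );
    if (missing.length > 0) {
      throw new Error(`${call.name}: missing ${missing.join(", ")}`);
    }
    batches.push([call]); // one save call per batch
  }
  if (readBatch.length > 0) batches.push(readBatch);
  return batches;
}

const plan = planBatches([
  { name: "oracle", args: {} },
  { name: "decisions", args: {} },
  { name: "save_memory", args: { type: "pattern", title: "t", description: "d" } },
  { name: "memories", args: {} },
  { name: "save_decision", args: { title: "t", decision: "x", reasoning: "y" } },
]);
```

Here `plan` holds four batches: the two leading reads together, then `save_memory` alone, then the `memories` read, then `save_decision` alone.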
No schema loosening, no behavior change for valid calls (verified:
{type,title,description} still parses; empty {} now returns the
actionable messages). Type-check clean; 613/613 tests pass.
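
The verification above can be mirrored with a dependency-free stand-in for the schema. In zod v4 the same idea is expressed by passing `{ error: "..." }` when declaring each field; the sketch below avoids the library so it runs on its own. The message wording and the `type` value are illustrative assumptions, not the text or values used in this change.

```typescript
// Dependency-free stand-in for the schema; not the project's zod code.
type Field = { name: string; error: string };

// Illustrative wording only; the PR's exact message text is not reproduced.
const requiredMsg = (name: string): string =>
  `${name} is REQUIRED and must be composed in THIS call; ` +
  `an empty or deferred tool-call emission is the usual cause.`;

const SAVE_MEMORY_FIELDS: Field[] = ["type", "title", "description"].map(
  (name) => ({ name, error: requiredMsg(name) })
);

type ParseResult =
  | { success: true; data: Record<string, string> }
  | { success: false; errors: Record<string, string> };

// Collect one message per missing field instead of stopping at the first.
function safeParse(fields: Field[], input: Record<string, unknown>): ParseResult {
  const data: Record<string, string> = {};
  const errors: Record<string, string> = {};
  for (const { name, error } of fields) {
    const value = input[name];
    if (typeof value === "string") data[name] = value;
    else errors[name] = error;
  }
  return Object.keys(errors).length === 0
    ? { success: true, data }
    : { success: false, errors };
}

// Empty payload: every required field reports its own actionable message.
const emptyResult = safeParse(SAVE_MEMORY_FIELDS, {});

// Valid payload: parses exactly as before, with no messages involved.
const validResult = safeParse(SAVE_MEMORY_FIELDS, {
  type: "pattern", // hypothetical value
  title: "Example title",
  description: "Example description",
});
```

Only the failure path changes: valid input never touches the custom messages, which matches the "no behavior change for valid calls" claim.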
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Two separate agent sessions hit `axme_save_memory` failing with `"expected string, received undefined"` on every required field: the args object arrived empty (`{}`). One agent retried 9× and filed a server-bug report; another did a controlled root-cause writeup (`SAVE_MEMORY_ARG_REPORT.md`).
Verified it is NOT a server/handler/schema defect:
- `save_memory` and `save_decision` handlers read args identically (`server.tool(name, desc, schema, async (args) => …)`). There is no per-tool routing difference, which disproves the "args not forwarded for this tool" hypothesis.
- `save_decision` worked in the same session, same client.
- An empty `{}` payload is rejected by the SDK; our code never sees it.
Root cause is client-side: the agent emits the tool-call shell while deferring the heavy free-text fields, and the fill never happens. Two factors make `save_memory` uniquely prone:
- Heaviest required surface among the axme tools (`type` + two large free-text fields).
- An over-generalized "batch axme calls in parallel" habit (`axme_context` tells the agent to call oracle/decisions/memories in parallel, and that habit bleeds onto the heavy-payload write tool).
Why not "echo received keys"
The report's top hardening idea was to echo received argument keys in the error. But the SDK auto-validates before our handler, so we'd have to loosen the advertised schema and validate manually — which would worsen the root cause by hiding the required fields from the model's tool picker. Dropped. The strict schema is what advertises the required fields in the first place.
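
To make the trade-off concrete, here is a minimal sketch of how loosening would change what a client sees. It assumes the advertised tool schema is derived from the field definitions in the usual JSON Schema manner, with a `required` list; `advertise`, `strict`, and `loosened` are hypothetical names for illustration.

```typescript
// Hypothetical model of a tool's advertised schema; not the SDK's types.
type FieldSpec = { type: "string"; optional: boolean };
type Shape = Record<string, FieldSpec>;

// Build the JSON-Schema-like summary a client would see for a tool.
function advertise(shape: Shape): { properties: string[]; required: string[] } {
  const properties = Object.keys(shape);
  const required = Object.entries(shape)
    .filter(([, spec]) => !spec.optional)
    .map(([name]) => name);
  return { properties, required };
}

// Strict schema: what the tool advertises today.
const strict = advertise({
  type: { type: "string", optional: false },
  title: { type: "string", optional: false },
  description: { type: "string", optional: false },
});

// Loosened schema: what would be needed for {} to reach the handler
// so that the handler could echo the received keys.
const loosened = advertise({
  type: { type: "string", optional: true },
  title: { type: "string", optional: true },
  description: { type: "string", optional: true },
});
```

With the loosened shape the `required` list is empty, so the model gets no schema-level signal that the three fields must be present.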
Fix (text-only, no schema loosening, no behavior change for valid calls)
A. Custom zod v4 `{ error }` messages on the required fields of `save_memory` (type/title/description) and `save_decision` (title/decision/reasoning). Instead of the bland `"received undefined"` that got misread as "the server lost my args", each field now says it is REQUIRED, must be composed in THIS call, and that an empty or deferred emission is the usual cause.
B. Hardened tool descriptions: explicit "call standalone, not in a parallel batch; include all required fields in the same call" + a worked example arguments object for `save_memory`.
C. Clarified server instructions: the `axme_context`-start guidance now says parallelism is for the READ tools (oracle/decisions/memories) only, and a new SAVE-TOOL RULE states `save_memory`/`save_decision`/`update_safety` are called one at a time with all required fields composed in the same call.
Test plan
- `{ error }` messages verified to surface per-field on empty `{}` (simulated the SDK's `z.object(shape).safeParse({})`).
- `{type, title, description}` still parses unchanged.
- `npx tsc --noEmit` clean.
- `npm test`: 613/613 (one flaky concurrent-lock/live-PR subtest passed on rerun; unrelated to this change).
- `npm run build` clean.
Effect
A future agent that emits an empty `save_memory` call now gets a first-try, self-correcting message naming each field and telling it to compose them in the same standalone call, instead of a 9×-retry loop ending in a false server-bug report.
🤖 Generated with Claude Code