Skip to content

Pass json-schema-ref-no-deref conformance scenario (SEP-2106)#2924

Merged
Kludex merged 12 commits into
mainfrom
sep-2106-json-schema-ref
Jun 20, 2026
Merged

Pass json-schema-ref-no-deref conformance scenario (SEP-2106)#2924
Kludex merged 12 commits into
mainfrom
sep-2106-json-schema-ref

Conversation

@Kludex

@Kludex Kludex commented Jun 20, 2026

Copy link
Copy Markdown
Member

Closes #2903.

SEP-2106 permits the full JSON Schema 2020-12 vocabulary in tool schemas, including $ref. A $ref resolving to a network URI is an SSRF / fetch-DoS vector. The SEP requires:

  • Implementations MUST NOT automatically dereference a $ref that is not a same-document reference.
  • A schema that fails to validate due to an unresolved external $ref SHOULD be rejected rather than treated as permissive.

The SDK already satisfies the MUST NOT: there is no $ref-fetching code anywhere, and jsonschema's referencing backend raises Unresolvable rather than performing an HTTP GET. So no $ref-resolution change is needed.

What changed

  • Added a json-schema-ref-no-deref driver handler to the conformance client. Its absence (not actual dereferencing) was why the scenario failed; the SDK was already compliant.
  • Removed the scenario from both expected-failures baselines.
  • Added a client test confirming a network $ref in an output schema, under a property the structured content does not set, is validated without being dereferenced.

Scope notes

  • I deliberately did not add eager registration-time rejection of external $refs. The SEP only forbids dereferencing them; an external $ref that is never fetched is legitimate 2020-12 (the SEP lists $ref among the keywords it enables). Eager rejection would forbid valid schemas and go beyond the spec. The issue text says "server rejects unresolvable external $ref at registration"; this PR follows the SEP's narrower, conditional SHOULD instead. Happy to revisit if maintainers prefer the stricter reading.
  • The structuredContent widening to any JSON value already landed in Protocol types for 2026-07-28: superset monolith, committed per-version packages, and wire-method maps #2849; I audited the consumers and found no object/truthiness assumptions.
  • Composition-keyword resource bounds (SEP SHOULD: schema depth / validation-time limits) are not implemented here.

Verification

json-schema-ref-no-deref passes on both the 2025 and 2026-07-28 client legs locally, with "Baseline check passed: all failures are expected" (no stale-entry error). Full suite green, 100% coverage, strict-no-cover/pyright/ruff clean.

AI Disclaimer

This PR was developed with the assistance of either Claude or Codex. I've reviewed and verified the changes.

Kludex added 3 commits June 20, 2026 17:24
SEP-2106 permits the full JSON Schema 2020-12 vocabulary in tool schemas,
including $ref. A $ref resolving to a network URI is an SSRF / fetch-DoS
vector, so per the spec implementations MUST NOT auto-dereference any $ref
that is not a same-document reference.

Add a shared ref-walker that rejects external $refs with ExternalSchemaRefError.
The server now rejects them at tool registration; the client rejects them before
validating a tool result instead of attempting to resolve them. Burn down the
json-schema-ref-no-deref conformance scenario in both expected-failures files
and add a driver handler for it.
Move the json-schema-ref-no-deref driver handler, both expected-failures
updates, and the migration guide entry out of this PR; they will land
separately. This PR is now scoped to the core SEP-2106 code and tests.
Move external-$ref rejection into StrictJsonSchema (now in its own
_schema_generator.py) so it happens during schema generation, driven by the
schema_generator parameter of model_json_schema, rather than walking the
schema afterwards. The output-schema path re-raises ExternalSchemaRefError
instead of swallowing it as an unserializable-type ValueError. Drop the
standalone mcp.shared.json_schema_ref helper and the client-side check (the
client receives schemas, it never generates them, and already never fetches
network refs).
Comment thread src/mcp/shared/json_schema_ref.py Outdated
Kludex added 2 commits June 20, 2026 17:39
func_metadata already defined StrictJsonSchema; keep the SEP-2106 external-$ref
rejection there instead of a separate _schema_generator module.
Subclassing Exception (not ValueError) keeps it from being swallowed by the
schema-generation fallback that degrades unserializable types, so the explicit
re-raise is no longer needed. Drop em-dashes from the docstrings.
Comment thread src/mcp/server/mcpserver/tools/base.py Outdated
Add a driver handler for the scenario (the missing handler was why it failed,
not actual dereferencing) and remove it from both expected-failures baselines.
Document the server-side external-$ref rejection in the migration guide.
Comment thread src/mcp/server/mcpserver/utilities/func_metadata.py Outdated
The SEP requires implementations to never auto-dereference network $refs (a
MUST NOT, already satisfied: the SDK has no $ref-fetching code) and to reject a
schema only when validation hits an unresolved external $ref (a conditional
SHOULD). Drop the eager registration-time rejection added earlier - it forbade
legitimate, never-fetched external $refs, going beyond the spec. Keep the
conformance burn-down and add a client test confirming an unexercised network
$ref in an output schema is not dereferenced.
@Kludex Kludex changed the title Reject external JSON Schema $ref in tool schemas (SEP-2106) Pass json-schema-ref-no-deref conformance scenario (SEP-2106) Jun 20, 2026
Kludex added 2 commits June 20, 2026 18:06
The scenario only requires the client to call tools/list against a server
advertising a network $ref; ClientSession never resolves $refs, so the existing
initialize handler (initialize + list_tools) already satisfies it. Register it
under both names instead of duplicating it.
One @register per function: revert the stacked decorators and add a dedicated
handler. It just initializes and lists tools (ClientSession never resolves
$refs), with a docstring explaining why that is sufficient.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional findings (outside current diff — PR may have been updated during review):

  • 🟡 src/mcp/server/mcpserver/utilities/func_metadata.py:33-40 — Because ExternalSchemaRefError deliberately doesn't subclass ValueError, it escapes the graceful-degradation except tuple in _try_create_model_and_schema and now also fails registration of prompts and resource templates whose return type carries an external $ref via json_schema_extra — even though those output schemas are never advertised to clients (Prompt.from_function / ResourceTemplate.from_function only use the arg model). Consider having those call sites pass structured_output=False, catching ExternalSchemaRefError in _try_create_model_and_schema for non-tool callers, or enforcing the rejection only on actual tool schema paths.

    Extended reasoning...

    What the bug is. SEP-2106 governs tool schemas, and the migration entry only documents tool registration failing. But ExternalSchemaRefError now also breaks registration of prompts and resource templates whose function return type contains an external $ref, even though those output schemas are never sent on the wire — neither Prompt nor ResourceTemplate stores the FuncMetadata output schema; they only use arg_model for parameter listing.

    The code path. Prompt.from_function (src/mcp/server/mcpserver/prompts/base.py:106) and ResourceTemplate.from_function (src/mcp/server/mcpserver/resources/templates.py:68) call func_metadata(fn, skip_names=...) with the default structured_output=None. func_metadata then auto-detects structured output from the return annotation and renders the output model via model.model_json_schema(schema_generator=StrictJsonSchema) inside _try_create_model_and_schema. After this PR, StrictJsonSchema.generate() raises ExternalSchemaRefError when the rendered schema contains an external $ref.

    Why existing code doesn't prevent it. The graceful-degradation except tuple in _try_create_model_and_schema is (PydanticUserError, TypeError, ValueError, SchemaError, ValidationError). ExternalSchemaRefError deliberately subclasses Exception (not ValueError) precisely so it escapes that fallback on the tool path — but that decision means the only containment for non-tool callers of func_metadata no longer applies either, so the exception propagates out of func_metadata() and out of Prompt.from_function / ResourceTemplate.from_function, failing registration.

    Step-by-step proof (reproduced empirically against the PR head):

    class Profile(BaseModel):
        data: Annotated[dict[str, Any], Field(json_schema_extra={"$ref": "https://example.com/profile.json"})]
    
    @mcp.resource("resource://users/{user_id}")
    def get_user(user_id: str) -> Profile: ...
    # ResourceTemplate.from_function -> ExternalSchemaRefError
    
    @mcp.prompt()
    def my_prompt(topic: str) -> Profile: ...
    # Prompt.from_function -> ExternalSchemaRefError

    Both registrations raise ExternalSchemaRefError: Tool schema contains external $ref(s) that MUST NOT be dereferenced (SEP-2106): https://example.com/profile.json .... Before this PR both succeeded: an external $ref in json_schema_extra produces no pydantic warning, so the output schema was generated and simply discarded.

    Impact. A behavioral regression outside the PR's stated SEP-2106 tool-schema scope: prompt/resource-template registration now fails for return types whose schema is never advertised to any client, with a misleading 'Tool schema contains external $ref(s)' message and no mention in docs/migration.md (which only covers tool registration). Mitigating factors: the trigger requires deliberately injecting an external $ref into the return-type model of a prompt/resource function (uncommon), and the failure is loud at registration time rather than silent misbehavior — hence nit severity.

    How to fix. Options: (a) have Prompt.from_function and ResourceTemplate.from_function pass structured_output=False, since they never use the output model; (b) catch ExternalSchemaRefError in _try_create_model_and_schema for callers that don't advertise the schema; or (c) enforce the external-$ref rejection only on the actual tool schema paths (e.g. in Tool.from_function on the rendered input schema and on FuncMetadata.output_schema when it will actually be used as a tool output schema). This is distinct from the already-reported input-path warnings-to-errors change and the data-keyword false-positive: the trigger here is the return type of a non-tool callable.

Kludex added 3 commits June 20, 2026 18:13
# Conflicts:
#	.github/actions/conformance/expected-failures.2026-07-28.yml
#	.github/actions/conformance/expected-failures.yml
Patch socket.connect / create_connection to fail on any outbound connection
while validating a tool result, proving the network $ref is never fetched.
The conformance scenario already proves no-fetch end to end with a real canary
HTTP server; the unit test was a redundant local mirror.
@Kludex

Kludex commented Jun 20, 2026

Copy link
Copy Markdown
Member Author

Why the conformance wiring is the only change needed:

SEP-2106's security requirement is a MUST NOT dereference network $refs, not a MUST reject them. The SDK already satisfies it without any code change:

  • There is no $ref-fetching machinery anywhere in src/ - no RefResolver, no custom referencing.Registry, no retrieval hook.
  • The only place a tool schema is validated against data is the client's jsonschema.validate(structured_content, output_schema) call (session.py). It uses jsonschema 4.18+'s default referencing backend, which - unlike the old RefResolver - does not auto-fetch: an unresolved network $ref raises Unresolvable instead of issuing an HTTP GET.
  • ClientSession.list_tools() only deserializes the tool list; it never walks inputSchema or resolves $refs. So the client is structurally incapable of dereferencing during listing - exactly what the conformance scenario checks.

So the scenario was already passing behaviourally; it failed in CI only because our conformance driver (client.py) had no handler registered for json-schema-ref-no-deref, which made the driver exit 1 ("Client never requested tools/list"). Adding the handler + burning the scenario down in both expected-failures baselines is all that's required.

I deliberately did not add eager registration-time rejection of external $refs. The SEP only forbids dereferencing them; an external $ref that is never fetched is valid JSON Schema 2020-12 (the SEP lists $ref among the keywords it enables), so rejecting at registration would forbid legitimate schemas and go beyond the spec. The narrower SHOULD ("reject a schema that fails to validate due to an unresolved external $ref") is already met: validation raises rather than treating the ref as permissive.

The structuredContent widening to any JSON value landed separately in #2849; I audited the consumers for object/truthiness assumptions and found none.

@Kludex Kludex merged commit 5734acf into main Jun 20, 2026
32 checks passed
@Kludex Kludex deleted the sep-2106-json-schema-ref branch June 20, 2026 16:24

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — this revision only adds a conformance driver handler and burns down the two expected-failures baselines; no SDK code is touched.

Extended reasoning...

Overview

The PR now touches only three files under .github/actions/conformance/: a new json-schema-ref-no-deref scenario handler in the conformance driver (client.py), and removal of that scenario from the two expected-failures baselines. Earlier revisions of this branch modified SDK source (func_metadata.py, json_schema_ref.py, tools/base.py), but those changes have been dropped and my prior inline comments on them are resolved/moot — the final diff contains no src/ changes at all.

Security risks

None introduced by this diff. The change is CI test-harness wiring; it does not add any $ref-fetching or schema-handling code. The maintainer comment in the thread confirms the SDK already satisfies SEP-2106's MUST NOT (the jsonschema/referencing backend raises Unresolvable rather than fetching), and the conformance scenario was failing only because the driver lacked a registered handler.

Level of scrutiny

Low. The new handler mirrors the existing initialize handler pattern exactly (connect, initialize, list tools), and the baseline edits are mechanical removals. If the scenario did not actually pass, the conformance CI job would fail on the unexpected-failure check, so the change is self-verifying. The only finding is a nit about the PR description still mentioning a unit test that was deliberately dropped (commit c7522f2) — a documentation discrepancy, not a code defect.

Other factors

The conformance runner fails on stale baseline entries, so leaving the entries in place was not an option once the handler exists. The author reports both the 2025 and 2026-07-28 client legs pass locally, and the scope decision (no eager registration-time rejection) is explicitly discussed and endorsed in the thread.

Comment on lines +180 to +189
@register("json-schema-ref-no-deref")
async def run_json_schema_ref_no_deref(server_url: str) -> None:
"""Initialize and list tools; the scenario fails only if the client fetches a network $ref.

ClientSession never walks inputSchema or resolves $refs, so listing is enough (SEP-2106).
"""
async with streamable_http_client(url=server_url) as (read_stream, write_stream):
async with ClientSession(read_stream, write_stream) as session:
await session.initialize()
await session.list_tools()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The PR description states that a client test was added confirming a network $ref in an output schema is validated without being dereferenced, but that test was deliberately dropped in commit c7522f2 ('Drop the Python no-deref unit test') and the final diff contains only the conformance handler and the two expected-failures baselines. This is a description/documentation discrepancy only — either update the description (including the Verification section) to reflect that coverage now lives solely in the external conformance scenario, or re-add the in-repo test.

Extended reasoning...

What the issue is. The PR description's "What changed" section claims: "Added a client test confirming a network $ref in an output schema, under a property the structured content does not set, is validated without being dereferenced." The Verification section likewise reads as if SDK-level test coverage exists. However, the final diff touches only three files — .github/actions/conformance/client.py, expected-failures.yml, and expected-failures.2026-07-28.yml — and contains no test file at all.

How this happened. The branch history shows the test did exist at one point: commit 2c47fc4 ("Guard the no-deref test with a socket tripwire") added/modified a test in tests/client/test_output_schema_validation.py, and the final commit c7522f2 ("Drop the Python no-deref unit test") deliberately removed it, on the rationale that the conformance scenario already proves the no-fetch behavior end to end. The drop was intentional and reasonable — but the PR description was never updated to match.

Step-by-step verification.

  1. The changed-files list for this PR is exactly: .github/actions/conformance/client.py, .github/actions/conformance/expected-failures.yml, .github/actions/conformance/expected-failures.2026-07-28.yml. No file under tests/ is touched.
  2. git log on the branch shows 2c47fc4 adding/guarding the unit test and c7522f2 removing it (≈43 lines deleted from tests/client/test_output_schema_validation.py).
  3. A search of tests/ for no-deref / dereference coverage of output-schema $refs finds nothing remaining.
  4. Yet the description still says the client test was "Added", so a reviewer reading the description would expect an in-repo regression test that does not exist in the merged content.

Why it matters. Reviewers and future maintainers use the PR description to understand what coverage exists. As written, it overstates the in-repo coverage: the only thing asserting the no-dereference behavior is the external json-schema-ref-no-deref conformance scenario, which lives in the conformance harness rather than the SDK's own test suite. If that scenario is ever reorganized or the harness pin changes, there is no local regression test backing the behavior — and anyone relying on the description would believe there is.

This is not a code defect. All verifiers agreed the test removal itself was a defensible scoping decision (the unit test was a redundant local mirror of the conformance scenario), so this is purely a documentation discrepancy and does not block the PR.

How to fix. Either (a) update the PR description — remove or reword the "Added a client test…" bullet and adjust the Verification section to say the no-deref behavior is covered by the json-schema-ref-no-deref conformance scenario only — or (b) re-add the dropped unit test from 2c47fc4 if maintainers prefer in-repo coverage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SEP-2106 — JSON Schema 2020-12 + structuredContent

2 participants