From 36c7c97be12966ef7f473768fec3e4e344b64f27 Mon Sep 17 00:00:00 2001
From: Sam Morrow
Date: Tue, 16 Jun 2026 01:23:28 +0200
Subject: [PATCH 1/3] repo: make trust-annotations the base; FIDES becomes a scheme, not a sibling
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Restructure the incubation into two extensions plus interchangeable
data-labelling schemes, shipped as a stack of three PRs:

- This PR (base): shared scaffolding + the trust-annotations extension.
- action-metadata moves to its own stacked PR.
- The IFC/FIDES work moves out of specification/draft/ entirely — it is
  one data-labelling scheme (ifc.fides.v1) that fills the
  trust-annotations evidenceRef slot, not an extension and not a sibling.
  It lands in a schemes/ folder built to hold alternative approaches
  (data-class, capability tokens, cosigning, sequence-shape, attestation)
  so no single academic model is baked into the wire.

README, sep-disposition, related-work, trust-model, open-questions and
the decision log are reframed accordingly (two extensions + a schemes
folder; the range of candidate schemes drawn from the SEP-1913 thread
and cited literature).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 README.md                               |  63 +++++---
 docs/decisions.md                       |  28 ++++
 docs/open-questions.md                  |   2 +-
 docs/related-work.md                    |   8 +-
 docs/sep-disposition.md                 |   9 +-
 docs/trust-model.md                     |   6 +-
 specification/draft/action-metadata.mdx | 134 ----------------
 specification/draft/ifc-fides.mdx       | 193 ------------------------
 8 files changed, 87 insertions(+), 356 deletions(-)
 delete mode 100644 specification/draft/action-metadata.mdx
 delete mode 100644 specification/draft/ifc-fides.mdx

diff --git a/README.md b/README.md
index 2efc8ad..4e14298 100644
--- a/README.md
+++ b/README.md
@@ -25,11 +25,14 @@ potential narrower first cut?"
The subsequent design discussion converged on a layered answer: a small, stable annotation surface on the wire, with richer evidence kept out-of-band and referenced by a bounded pointer. -This repo follows that steer. Each concern becomes a **separate experimental -extension** with its own [reverse-DNS identifier](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#definition), -its own reference implementation, and its own path to a future Extensions -Track SEP. Drafts can graduate independently — directly addressing the "narrower -first cut" ask without throwing away the combinatoric value of the full set. +This repo follows that steer. The schema-bearing concerns become **separate +experimental extensions**, each with its own [reverse-DNS identifier](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#definition), +reference implementation, and path to a future Extensions Track SEP, so drafts +can graduate independently. The concrete data-labelling models that fill an +extension's evidence slot are kept separate again — as interchangeable **schemes** +rather than extensions — so no single academic model is baked into the wire. This +directly addresses the "narrower first cut" ask without throwing away the +combinatoric value of the full set. See [docs/decisions.md](docs/decisions.md) for the decision record and [docs/trust-model.md](docs/trust-model.md) for the shared enforcement model. @@ -40,19 +43,42 @@ See [docs/decisions.md](docs/decisions.md) for the decision record and | :--- | :--- | :--- | :--- | | [`io.modelcontextprotocol/trust-annotations`](specification/draft/trust-annotations.mdx) | Draft skeleton | **Primary extension.** A small, scheme-agnostic client-facing data-classification vocabulary (`sensitive`, `untrusted`) on result `_meta`, plus an optional `evidenceRef` pointer slot that carries richer payloads out-of-band. 
| Python SDK: [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) (138-test suite, healthcare demo, LLM usability study). | | [`io.modelcontextprotocol/action-metadata`](specification/draft/action-metadata.mdx) | Draft skeleton | `inputMetadata` / `returnMetadata` / outcome classifiers (incl. `requires_review`) on `ToolAnnotations`, describing where inputs go, where outputs originate, and what real-world effects a tool can cause. | Originally [SEP-2061 (Action Security Metadata)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) by [@rreichel3](https://github.com/rreichel3) — closed 2026-06-13 in favour of this extension; worked example `read_drafts` / `list_inbox` / `send_email`. | -| [`io.modelcontextprotocol/ifc-fides`](specification/draft/ifc-fides.mdx) | Draft skeleton | A **profile** of the `trust-annotations` `evidenceRef` slot: `type: "ifc.fides.v1"` carrying an integrity + confidentiality label for deterministic information-flow control, following the FIDES paper ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). | Emitter candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server) (does not emit IFC labels today — closing that gap is the proof point). | -### Why FIDES is a profile, not a top-level extension - -Information-flow control is modelled as a profile rather than the namespace -root because IFC (an integrity × confidentiality lattice) is one enforcement -model among several that reviewers raised — capability tokens, caller/tool -cosigning, and sequence-shape audit records. A top-level `ifc/` root would bake -one academic model into the namespace and foreclose the others. As one reviewer -put it, IFC "fits relatively well if you use annotations" — an endorsement of -IFC *as a profile*, not as the wire root. As a `type` value under -`trust-annotations`'s open-ended `evidenceRef` slot, the FIDES work stays -first-class while every other model can occupy the same slot. 
+Each extension is proposed in its own pull request so it can be reviewed and +graduate on its own clock. + +## Data-labelling schemes (the `evidenceRef` slot) + +The extensions above keep the wire vocabulary deliberately small. Richer +labelling lives **out-of-band**, referenced by the `trust-annotations` +[`evidenceRef`](specification/draft/trust-annotations.mdx) pointer, whose `type` +is an open string. A **scheme** is a concrete data-labelling or tool-annotation +approach that fills that slot under a `type` value. A scheme is **not** an +extension and not a sibling of the two above — it is one interchangeable way to +populate the evidence an extension carries, and a deployment can adopt, swap, or +ignore it without touching the extension. + +The [`schemes/`](schemes/) folder collects these approaches. **FIDES** is the +first worked example, defining `ifc.fides.v1`; it is one model among several that +reviewers and the literature have raised, and the slot is designed so any of them +can occupy it: + +| Scheme | `evidenceRef.type` | Source | +| :--- | :--- | :--- | +| FIDES information-flow control (integrity × confidentiality lattice) | `ifc.fides.v1` | [arXiv:2505.23643](https://arxiv.org/abs/2505.23643); emitter candidate [`github-mcp-server`](https://github.com/github/github-mcp-server) | +| Coarse data classification (4-level + regulatory scope) | `data-class.v1` | SEP-1913 taxonomy | +| Design-pattern controls (Plan-Then-Execute, Dual LLM, Map-Reduce) | _candidate_ | [arXiv:2506.08837](https://arxiv.org/abs/2506.08837) | +| Capability-token constraints (SINT) | _candidate_ | pshkv, [SEP-1913 thread](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) | +| Caller/tool cosigning | _candidate_ | viftode4, SEP-1913 thread | +| Sequence-shape audit records | _candidate_ | marras0914, SEP-1913 thread | +| Tool-call attestation (in-toto / OVERT envelopes) | _candidate_ | 
[SEP-2787](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) | + +Modelling IFC as a scheme rather than a namespace root is deliberate: a top-level +`ifc/` extension would bake one academic model into the wire and foreclose the +others. As one reviewer put it, IFC "fits relatively well if you use annotations" +— an endorsement of IFC *behind* the annotation slot, not as the slot itself. See +[`schemes/README.md`](schemes/README.md) for the full list and the bar for adding +a scheme. ## Relationship to SEP-1913 @@ -84,7 +110,8 @@ This repo mirrors the structure of official extension repositories such as [`ext-auth`](https://github.com/modelcontextprotocol/ext-auth): ``` -specification/draft/.mdx # one spec per extension +specification/draft/.mdx # one spec per extension (trust-annotations, action-metadata) +schemes/ # data-labelling schemes that fill the evidenceRef slot (FIDES, …) docs/ # decision log, open questions, related work MAINTAINERS.md # IG facilitators ``` diff --git a/docs/decisions.md b/docs/decisions.md index 33ecb8b..a451955 100644 --- a/docs/decisions.md +++ b/docs/decisions.md @@ -89,3 +89,31 @@ Resolution. SEP-1862 remains a core/Standards-Track protocol change. **Rationale.** The 2026-05-28 IG meeting concluded pre-flight is inherently a protocol-level change, not an extension. + +## 2026-06-16 — FIDES is a scheme, not a sibling extension + +**Decision.** Refines the 2026-06-10 "FIDES is a profile" decision. The IFC/FIDES +work moves out of `specification/draft/` (where it sat next to the two +extensions) into a `schemes/` folder. There are **two** extensions +(`trust-annotations`, `action-metadata`); FIDES is **one data-labelling scheme** +(`ifc.fides.v1`) that fills the `trust-annotations` `evidenceRef` slot. + +**Rationale.** FIDES is one model the extension *could* use, not a peer of the +extensions, and must not be presented as a sibling. 
The original SEP cites it +alongside ShardGuard and "Design Patterns for Securing LLM Agents," and the +SEP-1913 thread adds capability tokens, cosigning, sequence-shape, and +attestation models — so `schemes/` is a folder for interchangeable approaches, +with FIDES as the first worked one. This shows the range the open `evidenceRef` +slot is meant to carry rather than implying IFC is the privileged model. + +## 2026-06-16 — Three pull requests, stacked + +**Decision.** The work ships as three PRs: `trust-annotations` (the base, +carrying shared repo scaffolding), `action-metadata` (stacked on the base), and +the FIDES scheme in `schemes/` (stacked on the base). The two extensions are +independent; the FIDES scheme depends on `trust-annotations` because it fills +that extension's `evidenceRef` slot. + +**Rationale.** Separate PRs let each piece be reviewed and graduate on its own +clock. FIDES stacks on `trust-annotations` because a scheme has no meaning +without the slot it fills. diff --git a/docs/open-questions.md b/docs/open-questions.md index 4d5749e..7adb9d9 100644 --- a/docs/open-questions.md +++ b/docs/open-questions.md @@ -30,7 +30,7 @@ Tracked here rather than in the spec drafts, so the drafts stay non-temporal. - Open strings vs. closed enums for `destination` / `source` / `sensitivity`. - Does `requiresReview` need a machine-readable *reason* for good client UX? -## ifc-fides +## ifc-fides (scheme) - Inline `_meta.ifc` for low-friction adoption vs. always behind `evidenceRef`. - GitHub Enterprise `internal` repo visibility → `public`/`private` mapping diff --git a/docs/related-work.md b/docs/related-work.md index fdaf230..5cdb88f 100644 --- a/docs/related-work.md +++ b/docs/related-work.md @@ -10,11 +10,11 @@ annotation work. Several were surfaced in IG meetings (notably 2026-05-28). - [SEP-1862 — Tool Resolution / pre-flight checks](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862) — core-protocol, composes with these extensions. 
- [SEP-2133 — Extensions](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md) — the framework this repo incubates under. - [SEP-2127 — Server Cards](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893) — precedent for the Standards→Extensions Track refactor. -- [SEP-2787 — Tool Call Attestation](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) — candidate `evidenceRef` profile. +- [SEP-2787 — Tool Call Attestation](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) — candidate `evidenceRef` scheme. ## Research -- **FIDES** — *Information-flow control for LLM agents.* [arXiv:2505.23643](https://arxiv.org/abs/2505.23643). Basis for the `ifc.fides.v1` profile. +- **FIDES** — *Information-flow control for LLM agents.* [arXiv:2505.23643](https://arxiv.org/abs/2505.23643). Basis for the `ifc.fides.v1` scheme in [`schemes/`](../schemes/). - **Design Patterns for Securing LLM Agents** — IBM/Google/Microsoft. [arXiv:2506.08837](https://arxiv.org/abs/2506.08837). Plan-Then-Execute, Dual LLM, Map-Reduce, etc. - **Trail of Bits** — prompt-injection via hidden content in GitHub issues. [blog](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/). - **OpenAI Auto Review** — https://alignment.openai.com/auto-review/ (shared in IG chat). @@ -22,7 +22,7 @@ annotation work. Several were surfaced in IG meetings (notably 2026-05-28). ## Implementations & tooling - [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) — reference Python SDK PoC for `trust-annotations`. -- [`github-mcp-server`](https://github.com/github/github-mcp-server) — public MCP server; emitter candidate for `ifc-fides` (knows repo visibility + collaborators). 
+- [`github-mcp-server`](https://github.com/github/github-mcp-server) — public MCP server; emitter candidate for the `ifc-fides` scheme (knows repo visibility + collaborators). - **Ethyca** data-labeling docs — https://www.ethyca.com/docs (shared in IG chat). - **GitHub Next** agentic-workflows research on data labeling — to be documented as issues in this repo (IG action item, @gokhanarkan / @joannakl). @@ -35,4 +35,4 @@ annotation work. Several were surfaced in IG meetings (notably 2026-05-28). - **Sequence-shape** policies — marras0914. These are exactly the models that `evidenceRef`'s open `type` is designed to -accommodate as profiles. +accommodate as schemes — see [`schemes/`](../schemes/). diff --git a/docs/sep-disposition.md b/docs/sep-disposition.md index 18e2505..7fe65d0 100644 --- a/docs/sep-disposition.md +++ b/docs/sep-disposition.md @@ -62,7 +62,10 @@ preference: - **(C)** Close 1913 outright and open three fresh Extensions Track SEPs. Loses the discussion history's continuity; not preferred. -**Moved into extensions:** `trust-annotations`, `action-metadata`, `ifc-fides`. +**Moved into extensions:** `trust-annotations`, `action-metadata`. +**Moved into `schemes/`:** the IFC/FIDES work, as one data-labelling **scheme** +(`ifc.fides.v1`) that fills the `trust-annotations` `evidenceRef` slot — not an +extension and not a sibling of the two above. **Parked on the umbrella:** `maliciousActivityHint`, session-level propagation rules. See [open-questions.md](./open-questions.md). @@ -90,14 +93,14 @@ with it if it lands, but do not block on it. for Tools)** — tracked by the IG as discussion items; not part of these extensions. Cross-link only. - **SEP-2787 (Tool Call Attestation)** and the various attestation/evidence - threads — these are natural `evidenceRef` *profile* candidates rather than + threads — these are natural `evidenceRef` **scheme** candidates rather than competitors. Coordinate so the `evidenceRef.type` registry can list them. 
## Mapping table | SEP | Title | Proposed disposition | Extension home | | :-- | :-- | :-- | :-- | -| 1913 | Trust & Sensitivity Annotations | Umbrella thread; schema moves to extensions | `trust-annotations` (+ `ifc-fides`) | +| 1913 | Trust & Sensitivity Annotations | Umbrella thread; schema moves to extensions | `trust-annotations` (+ `schemes/ifc-fides`) | | 2061 | Action Security Metadata | **Closed 2026-06-13**; lives as extension | `action-metadata` | | 1862 | Tool Resolution (pre-flight) | Stays core / Standards Track | — (composes, no dependency) | | 1984 | Comprehensive Tool Annotations | IG discussion item | — | diff --git a/docs/trust-model.md b/docs/trust-model.md index 98774e4..8d85156 100644 --- a/docs/trust-model.md +++ b/docs/trust-model.md @@ -1,7 +1,7 @@ # Trust model -A single statement of the enforcement model shared by all extensions in this -repository, so individual specs don't re-litigate it. +A single statement of the enforcement model shared across this repository's +extensions and data-labelling schemes, so individual specs don't re-litigate it. ## Annotations are claims, not guarantees @@ -41,7 +41,7 @@ SEP-1913 thread): rather than blanket-blocking flows a policy engine is unsure about, **flag the specific call for user confirmation**. This preserves utility while keeping a human on the genuinely risky edges, and is the recommended default for `requiresReview` ([`action-metadata`](../specification/draft/action-metadata.mdx)) -and for IFC policy violations ([`ifc-fides`](../specification/draft/ifc-fides.mdx)). +and for IFC policy violations (the [`ifc-fides`](../schemes/ifc-fides.md) scheme). 
## Cross-domain is the hard case diff --git a/specification/draft/action-metadata.mdx b/specification/draft/action-metadata.mdx deleted file mode 100644 index dd27eee..0000000 --- a/specification/draft/action-metadata.mdx +++ /dev/null @@ -1,134 +0,0 @@ ---- -title: Action Metadata ---- - -**Protocol Revision**: draft - -**Extension identifier:** `io.modelcontextprotocol/action-metadata` - -> ⚠️ **Experimental draft skeleton.** This carries forward -> [SEP-2061: Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) -> by [@rreichel3](https://github.com/rreichel3) into the IG's experimental repo, -> per the May 28 2026 decision to pursue trust/privacy work as an extension -> first. SEP-2061 was [closed](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061#issuecomment-4675049171) -> on 2026-06-13 in favour of this extension as the home for the work; this draft -> is now the canonical place to discuss the field semantics. - -## Abstract - -This extension adds a small, declarative contract to a tool's static -`ToolAnnotations` describing **what the tool does with data**: where inputs may -go, where outputs originate, and what real-world outcome invoking it can cause. -Where [`trust-annotations`](./trust-annotations.mdx) classifies *data in -transit*, this extension classifies *tool behavior*. The two are complementary -and independently adoptable — a client can consume action metadata without -implementing trust annotations at all. - -## Motivation - -MCP today treats all tool calls as equivalent at the protocol level beyond the -coarse `readOnlyHint` / `destructiveHint` / `idempotentHint` / `openWorldHint` -hints. A tool that reads drafts and a tool that sends email are otherwise -indistinguishable, even though their privacy and consent implications differ -radically. Runtimes fall back to inferring risk from tool names or model -behavior, which does not scale. 
- -This was reinforced in the May 28 2026 IG meeting: a model often **cannot tell -whether a target is private or public**, and absent that signal it may push -content somewhere it should not. A declarative behavioral contract lets clients -and models make safer decisions without baking domain knowledge into every -model. - -The canonical worked example from SEP-2061: `read_drafts`, `list_inbox`, and -`send_email` can share an identical JSON Schema yet have completely different -security semantics — only action metadata distinguishes them. - -## Specification - -### Dependencies - -This extension annotates the existing `ToolAnnotations` object returned by -`tools/list`. It has no dependency on `trust-annotations` or on Tool Resolution. - -### Fields - -Carried under the extension-namespaced key on `ToolAnnotations`: - -```jsonc -{ - "annotations": { - "io.modelcontextprotocol/action-metadata": { - "inputMetadata": { - "destination": "external", // where input data may be stored/sent - "sensitivity": "personal" // kind of data the tool accepts - }, - "returnMetadata": { - "source": "open-world", // where returned data originates - "sensitivity": "public" - }, - "outcome": "consequential", // benign | consequential | irreversible - "requiresReview": true // host SHOULD seek human confirmation - } - } -} -``` - -| Field | Meaning | -| :--- | :--- | -| `inputMetadata.destination` | Where data passed to the tool may end up (e.g. `local`, `internal`, `external`). | -| `inputMetadata.sensitivity` | The kind of data the tool is designed to accept. | -| `returnMetadata.source` | Where the tool's returned data originates (e.g. `first-party`, `open-world`). | -| `returnMetadata.sensitivity` | The kind of data the tool is designed to return. | -| `outcome` | Real-world effect class: `benign` / `consequential` / `irreversible`. | -| `requiresReview` | The tool author signals that a host SHOULD obtain explicit human confirmation before invocation. 
| - -> Exact enum value sets are inherited from SEP-2061 as the starting point and -> are **not** re-litigated here. With SEP-2061 now closed, this draft is where -> they evolve. - -### `requiresReview` lives here, deliberately - -`requiresReview` is a **workflow/consent** signal, not a data-classification -property. It was intentionally moved out of [`trust-annotations`](./trust-annotations.mdx) -(which stays strictly data-classifying) to avoid reproducing SEP-1913's -"several concerns in one schema" problem at smaller scale. It sits next to -`outcome` because both describe the *act* of calling the tool rather than the -*data* in flight. - -### Lifecycle and `list_changed` - -These fields are part of the **tool definition** (`ToolAnnotations`). They are -therefore covered by `tools/list_changed`: a server that changes a tool's -action metadata MUST emit `list_changed` as it would for any tool-definition -change. (This is the opposite of `trust-annotations`, which is response-level.) - -## Relationship to existing annotations - -`outcome: irreversible` overlaps conceptually with `destructiveHint` but is -strictly richer (a three-way classification vs. a boolean) and is scoped to the -real-world effect rather than to whether the operation is destructive to -server-side state. The IG will need to decide whether action metadata -*supersedes* or *coexists with* the legacy hints before any graduation. - -## Reference implementation - -Per [SEP-2061](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061): -the `read_drafts` / `list_inbox` / `send_email` worked example with identical -schemas and divergent action metadata. A public MCP server emitting these -fields (candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server)) -would anchor the draft in a real ecosystem. - -## Open questions - -- Coexistence vs. replacement of `destructiveHint` / `readOnlyHint`. 
-- Whether `destination` / `source` / `sensitivity` enums should be open strings - (consistent with `evidenceRef.type`) or closed enums. -- Whether `requiresReview` needs a machine-readable *reason* (vs. a bare - boolean) to drive good client UX. - -## Changelog - -| Date | Change | -| ---------- | ------------------------------------------------------------------- | -| 2026-06-10 | Initial draft skeleton, carrying SEP-2061 into the experimental repo; absorbed `requiresReview` from the trust taxonomy. | -| 2026-06-15 | SEP-2061 closed in favour of this extension; this draft is now the canonical home for the field semantics. | diff --git a/specification/draft/ifc-fides.mdx b/specification/draft/ifc-fides.mdx deleted file mode 100644 index 622c073..0000000 --- a/specification/draft/ifc-fides.mdx +++ /dev/null @@ -1,193 +0,0 @@ ---- -title: Information-Flow Control (FIDES profile) ---- - -**Protocol Revision**: draft - -**Extension identifier:** `io.modelcontextprotocol/ifc-fides`  ·  **Profile of:** `io.modelcontextprotocol/trust-annotations` - -> ⚠️ **Experimental draft skeleton.** This defines a *profile* of the -> [`trust-annotations`](./trust-annotations.mdx) `evidenceRef` slot. It is **not** -> a standalone wire root — see [Why a profile](#why-a-profile). - -## Abstract - -This extension defines `ifc.fides.v1`, a profile of the `trust-annotations` -`evidenceRef` slot that carries an **information-flow-control label** — -integrity plus confidentiality — following the FIDES model -([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). A host that implements -deterministic information-flow control can consume these labels to decide -whether a tool call is permitted, without baking the IFC model into the core -protocol or into the `trust-annotations` wire surface. 
- -## Why a profile - -Information-flow control is one enforcement model among several that reviewers -of SEP-1913 raised — capability tokens, caller/tool cosigning, and -sequence-shape audit records were all put forward. A top-level extension -(`io.modelcontextprotocol/ifc`) would make the FIDES integrity × confidentiality -lattice the namespace root and silently foreclose those other models. - -As a `type` value under the open-ended `evidenceRef` slot, the FIDES label is -first-class while the slot stays free for every other model. One reviewer's -framing captured it: IFC "fits relatively well *if you use annotations*" — an -endorsement of IFC as a profile, not as the wire root. - -## Motivation - -The motivating case is the one raised in the -[2026-05-28 IG meeting](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820): -**a model often cannot tell whether a repository is public or private**, and -lacking that signal it may push private content to a public destination. An IFC -label lets the host track confidentiality (who may read this data) and -integrity (is this data trusted) as context accumulates across tool calls, and -deny or prompt before a flow violates policy. - -A public MCP server is the natural emitter. [`github-mcp-server`](https://github.com/github/github-mcp-server) -returns repository data whose confidentiality follows from repository visibility -and collaborator sets — the same public/private signal — but does **not** emit -IFC labels today. Closing that emitter gap is the concrete proof point for this -profile: a host-side consumer of the label shape already exists, so the missing -half is a server willing to emit it, classifying each resource it returns (see -[per-resource classification](#reference-implementation)). - -## Specification - -### Profile identity - -This profile is selected by `evidenceRef.type == "ifc.fides.v1"` on a -`trust-annotations` annotation. 
A client that does not implement IFC MUST be -able to ignore it safely (the surrounding `sensitive` / `untrusted` booleans and -the `digest`/`canonicalization` pair remain meaningful). - -### Label payload - -The record referenced by the `evidenceRef` (and, for low-friction adoption, MAY -be inlined by deployments that accept the wire cost) has the shape: - -```jsonc -{ - "integrity": "trusted", // "trusted" | "untrusted" (FIDES §4.1 two-level lattice) - "confidentiality": "public" // "public" | "private" -} -``` - -| Field | Meaning | -| :--- | :--- | -| `integrity` | Two-level integrity lattice (`trusted` ⊑ `untrusted`): trusted data may flow to untrusted sinks, not vice versa. | -| `confidentiality` | `"public"` = world-readable; `"private"` = an opaque marker meaning "restricted to some reader set". The concrete reader set is resolved host-side at policy-decision time (see [Reader-set resolution](#reader-set-resolution)). | - -> **Confidentiality is `public` / `private` only — never a reader list on the -> wire.** Emitting concrete reader identities (e.g. logins) is out of scope: user -> identity is not uniform across servers using different auth methods, the -> identities are themselves access-restricted data, and a single resource can -> have hundreds of readers. The opaque marker keeps the wire shape stable and the -> sensitive resolution host-side. - -### Label semantics - -The load-bearing distinction is between the wire and the host: **wire markers are -advisory hints; reader-set semantics are host-resolved.** The asymmetry between -the two joins below follows from that one cut. - -- **Join on accumulation.** As a session ingests labeled results, the context - label is the *join* of what it has seen: integrity degrades toward - `untrusted`, confidentiality narrows toward the smallest permitted reader set. 
- The two joins differ in *where* they can be computed, and the difference is - principled rather than incidental: - - **Integrity join is total and wire-computable.** The integrity lattice is - small and closed (`trusted ⊑ untrusted`), so `untrusted` dominates and the - join needs nothing beyond the wire values. - - **Confidentiality join is partial and host-resolved.** Reader sets are open - and host-knowledge-dependent. `public` is the one wire-computable case, - because its reader set is universal (`⊤`): `public ⊔ anything = public`. - `private ⊔ private`, by contrast, is the *intersection* of two reader sets - that the opaque markers don't carry, so it is **not** computable from the - wire — see [Reader-set resolution](#reader-set-resolution). -- **Policy check before egress.** Before a write/egress tool call, the host - checks whether the current context label may flow to the call's target. When - a label is absent, the host falls back to its default (trusted-action) - policy rather than assuming the worst — labels are an *additive* signal. - -> The normative integrity/confidentiality lattice definitions follow the FIDES -> paper, §4.1 and §4.3. This draft references the model rather than restating -> the proofs. - -### Reader-set resolution - -`"private"` is intentionally opaque on the wire — and that opaqueness is a -property of the security model, not a limitation of the spec. A reader set is not -transmissible without policy context, so the wire shape correctly declines to -carry it. Two distinct `"private"` markers (e.g. file contents from two different -private repositories) are **not equal**, and their confidentiality join is **not** -the same `"private"` token: data derived from both may flow only to principals who -can read *both* sources — the intersection of their reader sets. 
The opaque marker -cannot express this intersection, so a host that needs to make a precise -cross-source flow decision MUST resolve each `"private"` marker to a concrete -reader set before joining. - -Resolution is a host-side concern, performed at policy-decision time: - -1. The host maps each contributing `"private"` label back to its source (e.g. - via the `evidenceRef.ref` locator, or its own record of which tool result - carried the label). -2. The host queries the originating system for the current reader set (e.g. a - repository collaborators lookup) using its own credentials. -3. The host computes the flow decision over the resolved sets (intersection for - a join of multiple private sources) and then discards them. - -**When resolution is unavailable** — the `ref` is absent, the label is -digest-only, or the originating system is unreachable at decision time — the host -MUST NOT treat two opaque labels as equal, and MUST NOT treat `"private"` as -`"public"`. It denies, prompts, or applies its configured fail-closed policy. Two -`"private"` labels are equal only once resolution proves their sources are; until -a source is established, unknown or mixed provenance classifies as `"private"`, -never defaulted to `"public"` from a repository-level shortcut. - -The resolved reader set is a decision-time read performed under the host's own -credentials. It is not a durable grant: a host SHOULD NOT cache it as one or -serialize it back into annotations or evidence unless a deployment explicitly opts -in. This keeps the wire free of user identities while still letting the host make a -precise decision when it holds source provenance plus its own credentials. - -### Relationship to `trust-annotations` - -`ifc-fides` never appears without a host `trust-annotations` annotation -carrying the `evidenceRef`. The booleans are the universally-actionable signal; -the IFC label is the precise, host-checkable evidence behind them. 
- -## Reference implementation - -- **Consumer:** a host-side IFC engine that parses the `{integrity, - confidentiality}` label, maintains a context label across tool results, and - applies a flow policy before egress operations already exists in practice. - (Linked once a public reference is available.) -- **Emitter (gap / proof point):** [`github-mcp-server`](https://github.com/github/github-mcp-server) - is the candidate — it already knows repository visibility and collaborator - sets, which are the confidentiality inputs. Repository visibility is only a - *default* hint, not the whole story: a public repository can serve - sub-resources that are **not** world-readable (draft security advisories, - draft releases, the collaborator roster itself, authenticated-user fields), so - a correct emitter MUST classify **per resource returned**, not per repository. - That makes the emitter a non-trivial proof point rather than a one-line - `repo.private` read. - -## Open questions - -- Should the label be inlinable on `_meta.ifc` directly for low-friction - adoption, or always behind `evidenceRef` for schema minimalism? (Lean: - permit both; `evidenceRef` is canonical, inline is a convenience.) -- How does GitHub Enterprise `internal` repository visibility map onto the - `public` / `private` confidentiality model? (Audience is the whole org, - strictly broader than collaborators — likely classified `private` and resolved - host-side, or falls back to default policy.) -- Registry coordination with other attestation/evidence profiles - (e.g. SEP-2787) so `evidenceRef.type` values don't collide. - -## Changelog - -| Date | Change | -| ---------- | ------------------------------------------------------------ | -| 2026-06-10 | Initial draft skeleton. Reframed from a top-level `ifc` extension to a `trust-annotations` `evidenceRef` profile (`ifc.fides.v1`). 
| -| 2026-06-15 | Confidentiality limited to `public` / `private` on the wire (dropped reader-list); added Reader-set resolution section; emitter classifies per resource, not per repository. (Review: @JoannaaKL.) | -| 2026-06-16 | Lead the semantics with the wire-hint / host-resolved split; state the integrity-total vs confidentiality-partial asymmetry as principled (`public` = `⊤` is wire-computable, `private ⊔ private` is not); add fail-closed handling when resolution is unavailable and a no-durable-grant rule for resolved sets. (Review: @Rul1an.) | From 947c547c3d9ff08c769111c5a99f42d3b091dd63 Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Tue, 16 Jun 2026 01:30:47 +0200 Subject: [PATCH 2/3] Sync intent-comment record with the live SEP-1913 comment The posted SEP-1913 umbrella comment was updated to match the shipped structure: two extensions plus a schemes/ folder, with FIDES as one data-labelling scheme (ifc.fides.v1) rather than an extension/profile. Bring docs/intent-comment.md back in sync as the source of record and add the three stacked PR links (#2 base, #3 action-metadata, #4 FIDES scheme). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/intent-comment.md | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/docs/intent-comment.md b/docs/intent-comment.md index 7a99730..f3e4946 100644 --- a/docs/intent-comment.md +++ b/docs/intent-comment.md @@ -35,25 +35,37 @@ in sync with [sep-disposition.md](./sep-disposition.md). > an Extensions Track SEP. Incubation is in > [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations). > > -> Kicking off with a few draft extensions in the tool annotations repo — not -> sure yet whether they'd each need separate repos eventually, or whether -> grouping them in one is fine. That's part of what incubation is for. 
+> It's now scaffolded as \*\*two extensions plus a `schemes/` folder\*\* of +> interchangeable data-labelling approaches, shipped as a stacked set of PRs: +> > +> - [#2](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/2) — repo scaffolding + the `trust-annotations` extension (the base). +> - [#3](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/3) — the `action-metadata` extension. +> - [#4](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/4) — FIDES as a data-labelling \*\*scheme\*\* under `schemes/`. > > > | Extension | Scope | > |---|---| > | `trust-annotations` | The narrow data-classification taxonomy (`sensitive`, `untrusted`) + an open-ended `evidenceRef` pointer for richer, out-of-band evidence. | > | `action-metadata` | Tool I/O + outcome contract (folds in @rreichel3's SEP-2061). | -> | `ifc-fides` | Information-flow control ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) as **one profile** of `evidenceRef`, not a wire root — the public/private-repo confidentiality case, with github-mcp-server as an emitter example. | +> > +> \*\*Data-labelling schemes (the `evidenceRef` slot).\*\* Richer evidence models +> are \*not\* extensions and \*not\* a wire root. They live in `schemes/` as +> interchangeable fillers of the `trust-annotations` `evidenceRef` slot, each +> selected by an `evidenceRef.type` value, so a deployment can adopt one, several, +> or none without changing the extension. FIDES information-flow control +> ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) is the first worked scheme +> (`ifc.fides.v1`) — the public/private-repo confidentiality case, with +> github-mcp-server as an emitter example. The folder is built to hold the range of +> other models raised in review (coarse data classification, design-pattern +> controls, capability tokens, cosigning, sequence-shape, attestation). 
> > > Deliberately removed: `maliciousActivityHint` (the structural concerns raised > here are unresolved) and session-level propagation rules. > > > This follows the same Standards-Track → Extensions-Track refactor pattern as -> SEP-2127 (#2893). This PR will eventually pivot to the `trust-annotations` -> piece itself, with the other schema-bearing pieces moving out into their own -> extensions. Everything is still in the incubation phase, so naming, design, -> and the choice of what to put forward as an extension are all open for -> discussion in the IG. +> SEP-2127 (#2893). This PR is now the `trust-annotations` base of the stack; the +> `action-metadata` extension and the `schemes/` folder are stacked on it. +> Everything is still in the incubation phase, so naming, design, and the choice of +> what to put forward as an extension are all open for discussion in the IG. --- From 34640fb89cd246dd8bb1d8faed30e141bb8b31dd Mon Sep 17 00:00:00 2001 From: Rul1an Date: Tue, 16 Jun 2026 07:52:59 +0200 Subject: [PATCH 3/3] conformance: worked emitter/consumer cases for trust-annotations Reproducible worked cases validating the agreed shape, requested on #2: - Emitter classification for a public repo serving a private subresource: per-resource sensitivity (draft advisory / collaborator roster -> sensitive even though the repo is public), world-readable -> no claim, unknown/mixed -> sensitive (never defaulted public from a repo-level shortcut), and content-block union semantics (refine, never weaken a result-level claim). - evidenceRef re-derivation for type 'policy-decision' and 'sequence': the small annotation on the wire with the rich record out of band, the digest recomputed from the record under both cbor/rfc8949 and jcs/rfc8785 (the same record yields two distinct, each-recomputable digests; neither canonicalization is the default). Everything reproduces from examples.json alone. 
No third-party dependencies: a pure-Python JCS (RFC 8785) and a minimal canonical CBOR (RFC 8949 4.2.1) encoder, self-tested against the RFCs' own published vectors before any digest is trusted. Restricted value profile (strings, arrays, string-keyed maps, booleans, null, non-negative integers; no floats / number-format edge cases). Signed-off-by: Rul1an --- .../conformance/trust-annotations/README.md | 45 ++ .../trust-annotations/examples.json | 221 ++++++++++ .../trust-annotations/worked_cases.py | 413 ++++++++++++++++++ 3 files changed, 679 insertions(+) create mode 100644 specification/draft/conformance/trust-annotations/README.md create mode 100644 specification/draft/conformance/trust-annotations/examples.json create mode 100644 specification/draft/conformance/trust-annotations/worked_cases.py diff --git a/specification/draft/conformance/trust-annotations/README.md b/specification/draft/conformance/trust-annotations/README.md new file mode 100644 index 0000000..442a05d --- /dev/null +++ b/specification/draft/conformance/trust-annotations/README.md @@ -0,0 +1,45 @@ +# MCP trust-annotations — worked emitter/consumer cases + +Worked cases for the `io.modelcontextprotocol/trust-annotations` extension +([experimental-ext-tool-annotations PR #2](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/2)), +built at the maintainer's request to validate the agreed shape. Expressed entirely in that +extension's own vocabulary (`sensitive` / `untrusted` booleans + `evidenceRef` +{type, digest, canonicalization, schema, ref}); no outside schema is introduced. + +## What is covered + +**A. Emitter classification — public repo, private subresource.** Visibility is per-resource, not +per-repo: a world-public repository still serves things that are not world-readable (a draft security +advisory, the collaborator roster). The emitter classifies the resource it returns. 
The wire carries +the coarse `sensitive` boolean; the four-level class rides an out-of-band `evidenceRef` of +`type: "data-class.v1"`. World-readable resources make no `sensitive` claim (absence is "no claim", +never "asserted false"). Unknown / mixed provenance classifies `sensitive: true`, never defaulted to +public from a repo-level shortcut. A content-block annotation may refine a result-level claim but +MUST NOT weaken it (union semantics: once `true`, stays `true`). + +**B. `evidenceRef` re-derivation.** `type: "policy-decision"` and `type: "sequence"` — the small +annotation on the wire (type + digest + canonicalization) with the rich record out of band. A client +holding the record re-derives the digest independently. Shown under **both** `cbor/rfc8949` and +`jcs/rfc8785`; the same record under the two envelopes yields two distinct, each-recomputable digests, +so neither canonicalization reads as the default. + +## Reproducing + +``` +python3 worked_cases.py # self-test the encoders, (re)generate examples.json, verify +python3 worked_cases.py --verify # verify the committed examples.json only +``` + +Everything reproduces from the bytes in `examples.json` alone. The script recomputes every +`evidenceRef.digest` from its committed record under the declared `canonicalization` and asserts the +match, checks the per-resource classification rule, and checks content-block union semantics. + +No third-party dependencies: a pure-Python JCS (RFC 8785) and a minimal canonical CBOR (RFC 8949 +§4.2.1) encoder, **self-tested against the RFCs' own published vectors** before any digest is trusted. + +## Value profile (scope) + +Text strings, arrays, string-keyed maps, booleans, null, and non-negative integers. Floats and the +RFC 8785 number-format edge cases are intentionally out of scope — a record outside the profile raises +rather than hashing silently. 
This matches the draft's "small record" intent; a production profile +that admits floats would pin number formatting explicitly, which is a separate question for the spec. diff --git a/specification/draft/conformance/trust-annotations/examples.json b/specification/draft/conformance/trust-annotations/examples.json new file mode 100644 index 0000000..0d8479f --- /dev/null +++ b/specification/draft/conformance/trust-annotations/examples.json @@ -0,0 +1,221 @@ +{ + "canonicalizations": [ + "cbor/rfc8949", + "jcs/rfc8785" + ], + "cases": [ + { + "family": "emitter_classification", + "id": "A1_public_repo_draft_advisory", + "narrative": "A public repository serves a draft security advisory. The repo is world-public; the subresource is not. The emitter classifies the resource it returns, so the result is sensitive even though the container is public.", + "record": { + "class": "confidential", + "container_visibility": "public", + "origin": "open_world", + "resource_kind": "security_advisory_draft" + }, + "resource_visibility": "private", + "wire": { + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "evidenceRef": { + "canonicalization": "jcs/rfc8785", + "digest": "sha256:c509d629b2ab94b573c44ac10acb4941ab3949ff3cf857983c643bb0a476cefb", + "schema": "https://example/data-class.v1.json", + "type": "data-class.v1" + }, + "sensitive": true + } + } + } + }, + { + "family": "emitter_classification", + "id": "A2_public_repo_readme", + "narrative": "The same public repository serves its README. World-readable, so no sensitive claim is made (absence is 'no claim', never 'asserted false'). 
A data-class.v1 record may still record class=public for downstream audit.", + "record": { + "class": "public", + "container_visibility": "public", + "origin": "open_world", + "resource_kind": "readme" + }, + "resource_visibility": "public", + "wire": { + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "evidenceRef": { + "canonicalization": "jcs/rfc8785", + "digest": "sha256:363f86281f3e7c0996573bd26b65099dd03c893666a6ad51752194c8bad75bcf", + "type": "data-class.v1" + } + } + } + } + }, + { + "family": "emitter_classification", + "id": "A3_public_repo_collaborator_roster", + "narrative": "The same public repository serves its collaborator roster: not world-readable, so sensitive=true per-resource. Digest-only evidenceRef (no schema/ref) is still a usable bounded signal and remains re-derivable.", + "record": { + "class": "confidential", + "container_visibility": "public", + "origin": "first_party", + "resource_kind": "collaborator_roster" + }, + "resource_visibility": "private", + "wire": { + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "evidenceRef": { + "canonicalization": "cbor/rfc8949", + "digest": "sha256:a920a4da0a92c06841f9ae11f91aceb29f8d169da18554b386d809142bae9b56", + "type": "data-class.v1" + }, + "sensitive": true + } + } + } + }, + { + "family": "emitter_classification", + "id": "A4_unknown_or_mixed_provenance", + "narrative": "Provenance is not established at emit time (mixed or unknown source). 
The emitter classifies sensitive=true rather than defaulting to public from a repo-level shortcut; the consumer treats it conservatively until provenance is established.", + "record": { + "class": "unknown", + "origin": "unestablished", + "resource_kind": "mixed" + }, + "resource_visibility": "unknown", + "wire": { + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "evidenceRef": { + "canonicalization": "jcs/rfc8785", + "digest": "sha256:32926e4d78fe95aea10e44c1557da6304afe2468d8d5ecc3df949135af3cd34b", + "type": "data-class.v1" + }, + "sensitive": true + } + } + } + }, + { + "effective_sensitive": true, + "family": "emitter_classification", + "id": "A5_content_block_union_no_weaken", + "narrative": "One search result among many is the private subresource. The CallToolResult makes no result-level sensitive claim, but the offending ContentBlock sets sensitive=true. Union semantics: the block refines and may only strengthen; effective sensitive=true.", + "record": { + "class": "confidential", + "container_visibility": "public", + "origin": "open_world", + "resource_kind": "security_advisory_draft" + }, + "resource_visibility": "private", + "wire": { + "content_block_meta": { + "io.modelcontextprotocol/trust-annotations": { + "evidenceRef": { + "canonicalization": "jcs/rfc8785", + "digest": "sha256:c509d629b2ab94b573c44ac10acb4941ab3949ff3cf857983c643bb0a476cefb", + "type": "data-class.v1" + }, + "sensitive": true + } + }, + "result_meta": { + "io.modelcontextprotocol/trust-annotations": {} + } + } + }, + { + "family": "evidence_ref_rederivation", + "id": "B1_policy_decision_cbor", + "narrative": "A policy decision kept out of band; the wire annotation carries only the small evidenceRef. 
A client holding the decision record re-derives the digest under cbor/rfc8949 and matches.", + "record": { + "decision": "deny", + "reasons": [ + "sink_not_in_allowlist", + "input_marked_untrusted" + ], + "rule": "egress.block_unverified_sink", + "stage": "pre_call", + "subject": "tool:web.fetch" + }, + "wire": { + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "evidenceRef": { + "canonicalization": "cbor/rfc8949", + "digest": "sha256:48dce5db5820f4a6069c659b99075c4881184827fdb27af82079a96d5929f46d", + "ref": "audit://decisions/2026-06-16/0001", + "schema": "https://example/policy-decision.v1.json", + "type": "policy-decision" + } + } + } + } + }, + { + "family": "evidence_ref_rederivation", + "id": "B2_sequence_jcs", + "narrative": "A tool-call sequence shape kept out of band; the wire annotation carries only the small evidenceRef. A client re-derives the digest under jcs/rfc8785 and matches.", + "record": { + "flagged_step": 2, + "sequence": [ + "search", + "open_document", + "send_email" + ], + "window": 3 + }, + "wire": { + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "evidenceRef": { + "canonicalization": "jcs/rfc8785", + "digest": "sha256:0503307fcfdc6dd705dfe161a8442a31f167a0396ddeeac68883dad72f1944b7", + "schema": "https://example/sequence.v1.json", + "type": "sequence" + } + } + } + } + }, + { + "family": "evidence_ref_rederivation", + "id": "B3_same_record_two_envelopes", + "narrative": "The same data-class record under both canonicalizations yields two distinct, each-recomputable digests \u2014 canonicalization is a per-reference envelope choice, neither is the default.", + "record": { + "class": "confidential", + "container_visibility": "public", + "origin": "open_world", + "resource_kind": "security_advisory_draft" + }, + "wire_cbor": { + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "evidenceRef": { + "canonicalization": "cbor/rfc8949", + "digest": 
"sha256:6a1836c14764e8326ba9eb8e7231b45b93580889732f08523db3fa11e5635ed1", + "type": "data-class.v1" + } + } + } + }, + "wire_jcs": { + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "evidenceRef": { + "canonicalization": "jcs/rfc8785", + "digest": "sha256:c509d629b2ab94b573c44ac10acb4941ab3949ff3cf857983c643bb0a476cefb", + "type": "data-class.v1" + } + } + } + } + } + ], + "extension": "io.modelcontextprotocol/trust-annotations", + "summary": "worked emitter/consumer cases: public-repo private-subresource classification + policy-decision / sequence evidenceRef re-derivation, reproducible from bytes", + "value_profile": "text strings, arrays, string-keyed maps, booleans, null, non-negative integers (no floats / RFC 8785 number edge cases)" +} diff --git a/specification/draft/conformance/trust-annotations/worked_cases.py b/specification/draft/conformance/trust-annotations/worked_cases.py new file mode 100644 index 0000000..36f003c --- /dev/null +++ b/specification/draft/conformance/trust-annotations/worked_cases.py @@ -0,0 +1,413 @@ +#!/usr/bin/env python3 +"""Worked emitter/consumer cases for the MCP `io.modelcontextprotocol/trust-annotations` extension +(experimental-ext-tool-annotations PR #2), expressed entirely in that extension's own vocabulary. + +Two families the maintainer asked for to validate the shape: + + A. Emitter classification — a *public* repo that serves a *private* subresource. Visibility is a + per-resource property, not a per-repo one, so the emitter classifies the resource it returns, + not the container. The wire signal is the coarse `sensitive` boolean; the richer four-level + class rides an out-of-band `evidenceRef` of `type: "data-class.v1"`. Unknown / mixed provenance + classifies `sensitive: true` (conservative), never defaulted to public from a repo-level + shortcut. Content-block annotations use union semantics: a block may refine but MUST NOT weaken + a result-level claim. + + B. 
`evidenceRef` re-derivation — `type: "policy-decision"` and `type: "sequence"`, the small + annotation on the wire (type + digest + canonicalization) with the rich record out of band. + A client holding the record re-derives the digest independently. Shown under BOTH + `cbor/rfc8949` and `jcs/rfc8785` so neither canonicalization reads as the default; the same + record under the two envelopes yields two distinct, each-recomputable digests. + +Everything reproduces from the bytes in `examples.json` alone: this script recomputes every +`evidenceRef.digest` from the committed record under its declared `canonicalization` and asserts the +match, checks the per-resource classification rule, and checks content-block union semantics. No +third-party dependencies: a pure-Python JCS (RFC 8785) and a minimal canonical CBOR (RFC 8949 +§4.2.1) encoder are used and self-tested against the RFC's own published vectors before any digest is +trusted. + +Restricted value profile (matches the draft's "small record" intent): text strings, arrays, maps +with string keys, booleans, null, and non-negative integers. No floats / no RFC 8785 number-format +edge cases are exercised; a record outside the profile raises rather than hashing silently. + +Usage: + python3 worked_cases.py # self-test, (re)generate examples.json, verify, print PASS + python3 worked_cases.py --verify # verify the committed examples.json only +""" + +from __future__ import annotations + +import argparse +import hashlib +import json +import os +import sys + +HERE = os.path.dirname(os.path.abspath(__file__)) +EXAMPLES = os.path.join(HERE, "examples.json") +EXT = "io.modelcontextprotocol/trust-annotations" + + +# -------------------------------------------------------------------------------------------------- +# Canonicalization (restricted profile), stdlib-only. 
+# -------------------------------------------------------------------------------------------------- +def _reject_unsupported(v) -> None: + if isinstance(v, bool) or v is None or isinstance(v, str): + return + if isinstance(v, int): + if v < 0: + raise ValueError("restricted profile: non-negative integers only") + return + if isinstance(v, list): + for x in v: + _reject_unsupported(x) + return + if isinstance(v, dict): + for k, val in v.items(): + if not isinstance(k, str): + raise ValueError("restricted profile: object keys must be strings") + _reject_unsupported(val) + return + raise ValueError(f"restricted profile: unsupported type {type(v).__name__}") + + +def jcs_bytes(record) -> bytes: + """RFC 8785 JCS for the restricted profile: sorted keys, no whitespace, UTF-8. (Number-format + canonicalization is not exercised because the profile excludes floats.)""" + _reject_unsupported(record) + return json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8") + + +def _cbor_head(major: int, n: int) -> bytes: + if n < 24: + return bytes([(major << 5) | n]) + if n < 0x100: + return bytes([(major << 5) | 24, n]) + if n < 0x10000: + return bytes([(major << 5) | 25]) + n.to_bytes(2, "big") + if n < 0x100000000: + return bytes([(major << 5) | 26]) + n.to_bytes(4, "big") + return bytes([(major << 5) | 27]) + n.to_bytes(8, "big") + + +def cbor_bytes(record) -> bytes: + """RFC 8949 core-deterministic encoding for the restricted profile: definite lengths, smallest + integer encoding, map keys sorted by their encoded bytes (§4.2.1).""" + _reject_unsupported(record) + + def enc(v) -> bytes: + if v is True: + return b"\xf5" + if v is False: + return b"\xf4" + if v is None: + return b"\xf6" + if isinstance(v, int): # bool already handled above + return _cbor_head(0, v) + if isinstance(v, str): + b = v.encode("utf-8") + return _cbor_head(3, len(b)) + b + if isinstance(v, list): + return _cbor_head(4, len(v)) + b"".join(enc(x) for x in v) + if 
isinstance(v, dict): + pairs = sorted(((enc(k), enc(val)) for k, val in v.items()), key=lambda kv: kv[0]) + return _cbor_head(5, len(v)) + b"".join(k + val for k, val in pairs) + raise ValueError(f"unsupported type {type(v).__name__}") + + return enc(record) + + +CANON = {"jcs/rfc8785": jcs_bytes, "cbor/rfc8949": cbor_bytes} + + +def compute_digest(record, canonicalization: str) -> str: + enc = CANON.get(canonicalization) + if enc is None: + raise ValueError(f"unknown canonicalization {canonicalization!r}") + return "sha256:" + hashlib.sha256(enc(record)).hexdigest() + + +# -------------------------------------------------------------------------------------------------- +# Self-test the encoders against the RFCs' own vectors before trusting any digest. +# -------------------------------------------------------------------------------------------------- +def self_test() -> None: + # RFC 8949 Appendix A (the subset within our value profile). + cbor_vectors = [ + (0, "00"), (1, "01"), (10, "0a"), (23, "17"), (24, "1818"), (25, "1819"), + (100, "1864"), (1000, "1903e8"), + (False, "f4"), (True, "f5"), (None, "f6"), + ("", "60"), ("a", "6161"), ("IETF", "6449455446"), + ([], "80"), ([1, 2, 3], "83010203"), + ({}, "a0"), + ({"a": 1, "b": [2, 3]}, "a26161016162820203"), + ({"a": "A", "b": "B", "c": "C", "d": "D", "e": "E"}, + "a56161614161626142616361436164614461656145"), + ] + for value, expected_hex in cbor_vectors: + got = cbor_bytes(value).hex() + assert got == expected_hex, f"CBOR self-test failed for {value!r}: {got} != {expected_hex}" + # CBOR canonical map-key ordering (§4.2.1): keys sorted by encoded bytes, shorter key first. + # {"a":2,"b":1,"aa":3} -> a3 (6161 02)(6162 01)(626161 03). + assert cbor_bytes({"b": 1, "a": 2, "aa": 3}).hex() == "a361610261620162616103" + # JCS sorts keys, drops whitespace, keeps UTF-8. 
+ assert jcs_bytes({"b": 1, "a": 2}) == b'{"a":2,"b":1}' + assert jcs_bytes({"name": "café"}) == '{"name":"café"}'.encode("utf-8") + + +# -------------------------------------------------------------------------------------------------- +# Out-of-band records (what each evidenceRef.digest commits to). +# -------------------------------------------------------------------------------------------------- +# A.1/A.3 — a public repo serving a non-public subresource: the rich four-level class lives here, +# the wire carries only `sensitive: true`. +REC_DRAFT_ADVISORY = { + "class": "confidential", + "origin": "open_world", + "resource_kind": "security_advisory_draft", + "container_visibility": "public", +} +REC_COLLABORATOR_ROSTER = { + "class": "confidential", + "origin": "first_party", + "resource_kind": "collaborator_roster", + "container_visibility": "public", +} +# A.2 — a world-readable resource from the same public repo. +REC_PUBLIC_README = { + "class": "public", + "origin": "open_world", + "resource_kind": "readme", + "container_visibility": "public", +} +# A.4 — provenance not established at emit time: classify sensitive, do not default to public. +REC_UNKNOWN = { + "class": "unknown", + "origin": "unestablished", + "resource_kind": "mixed", +} +# B.1 — a policy decision recorded out of band. +REC_POLICY_DECISION = { + "decision": "deny", + "rule": "egress.block_unverified_sink", + "subject": "tool:web.fetch", + "stage": "pre_call", + "reasons": ["sink_not_in_allowlist", "input_marked_untrusted"], +} +# B.2 — a tool-call sequence shape recorded out of band. 
+REC_SEQUENCE = { + "sequence": ["search", "open_document", "send_email"], + "window": 3, + "flagged_step": 2, +} + + +def evidence_ref(record, type_: str, canonicalization: str, *, schema=None, ref=None) -> dict: + er = { + "type": type_, + "digest": compute_digest(record, canonicalization), + "canonicalization": canonicalization, + } + if schema is not None: + er["schema"] = schema + if ref is not None: + er["ref"] = ref + return er + + +def build_cases() -> list[dict]: + cases: list[dict] = [] + + # ---- Family A: emitter classification (public repo, private subresource) ---- + cases.append({ + "id": "A1_public_repo_draft_advisory", + "family": "emitter_classification", + "narrative": "A public repository serves a draft security advisory. The repo is world-public; " + "the subresource is not. The emitter classifies the resource it returns, so the " + "result is sensitive even though the container is public.", + "resource_visibility": "private", + "wire": {"_meta": {EXT: { + "sensitive": True, + "evidenceRef": evidence_ref(REC_DRAFT_ADVISORY, "data-class.v1", "jcs/rfc8785", + schema="https://example/data-class.v1.json"), + }}}, + "record": REC_DRAFT_ADVISORY, + }) + cases.append({ + "id": "A2_public_repo_readme", + "family": "emitter_classification", + "narrative": "The same public repository serves its README. World-readable, so no sensitive " + "claim is made (absence is 'no claim', never 'asserted false'). A data-class.v1 " + "record may still record class=public for downstream audit.", + "resource_visibility": "public", + "wire": {"_meta": {EXT: { + "evidenceRef": evidence_ref(REC_PUBLIC_README, "data-class.v1", "jcs/rfc8785"), + }}}, + "record": REC_PUBLIC_README, + }) + cases.append({ + "id": "A3_public_repo_collaborator_roster", + "family": "emitter_classification", + "narrative": "The same public repository serves its collaborator roster: not world-readable, " + "so sensitive=true per-resource. 
Digest-only evidenceRef (no schema/ref) is still "
+                     "a usable bounded signal and remains re-derivable.",
+        "resource_visibility": "private",
+        "wire": {"_meta": {EXT: {
+            "sensitive": True,
+            "evidenceRef": evidence_ref(REC_COLLABORATOR_ROSTER, "data-class.v1", "cbor/rfc8949"),
+        }}},
+        "record": REC_COLLABORATOR_ROSTER,
+    })
+    cases.append({
+        "id": "A4_unknown_or_mixed_provenance",
+        "family": "emitter_classification",
+        "narrative": "Provenance is not established at emit time (mixed or unknown source). The emitter "
+                     "classifies sensitive=true rather than defaulting to public from a repo-level "
+                     "shortcut; the consumer treats it conservatively until provenance is established.",
+        "resource_visibility": "unknown",
+        "wire": {"_meta": {EXT: {
+            "sensitive": True,
+            "evidenceRef": evidence_ref(REC_UNKNOWN, "data-class.v1", "jcs/rfc8785"),
+        }}},
+        "record": REC_UNKNOWN,
+    })
+    cases.append({
+        "id": "A5_content_block_union_no_weaken",
+        "family": "emitter_classification",
+        "narrative": "One search result among many is the private subresource. The CallToolResult makes "
+                     "no result-level sensitive claim, but the offending ContentBlock sets sensitive=true. "
+                     "Union semantics: the block refines and may only strengthen; effective sensitive=true.",
+        "resource_visibility": "private",
+        "wire": {
+            "result_meta": {EXT: {}},
+            "content_block_meta": {EXT: {
+                "sensitive": True,
+                "evidenceRef": evidence_ref(REC_DRAFT_ADVISORY, "data-class.v1", "jcs/rfc8785"),
+            }},
+        },
+        "record": REC_DRAFT_ADVISORY,
+        "effective_sensitive": True,
+    })
+
+    # ---- Family B: evidenceRef re-derivation (policy-decision, sequence; both canonicalizations) ----
+    cases.append({
+        "id": "B1_policy_decision_cbor",
+        "family": "evidence_ref_rederivation",
+        "narrative": "A policy decision kept out of band; the wire annotation carries only the small "
+                     "evidenceRef. A client holding the decision record re-derives the digest under "
+                     "cbor/rfc8949 and matches.",
+        "wire": {"_meta": {EXT: {
+            "evidenceRef": evidence_ref(REC_POLICY_DECISION, "policy-decision", "cbor/rfc8949",
+                                        schema="https://example/policy-decision.v1.json",
+                                        ref="audit://decisions/2026-06-16/0001"),
+        }}},
+        "record": REC_POLICY_DECISION,
+    })
+    cases.append({
+        "id": "B2_sequence_jcs",
+        "family": "evidence_ref_rederivation",
+        "narrative": "A tool-call sequence shape kept out of band; the wire annotation carries only the "
+                     "small evidenceRef. A client re-derives the digest under jcs/rfc8785 and matches.",
+        "wire": {"_meta": {EXT: {
+            "evidenceRef": evidence_ref(REC_SEQUENCE, "sequence", "jcs/rfc8785",
+                                        schema="https://example/sequence.v1.json"),
+        }}},
+        "record": REC_SEQUENCE,
+    })
+    cases.append({
+        "id": "B3_same_record_two_envelopes",
+        "family": "evidence_ref_rederivation",
+        "narrative": "The same data-class record under both canonicalizations yields two distinct, "
+                     "each-recomputable digests — canonicalization is a per-reference envelope choice, "
+                     "neither is the default.",
+        "wire_cbor": {"_meta": {EXT: {
+            "evidenceRef": evidence_ref(REC_DRAFT_ADVISORY, "data-class.v1", "cbor/rfc8949"),
+        }}},
+        "wire_jcs": {"_meta": {EXT: {
+            "evidenceRef": evidence_ref(REC_DRAFT_ADVISORY, "data-class.v1", "jcs/rfc8785"),
+        }}},
+        "record": REC_DRAFT_ADVISORY,
+    })
+    return cases
+
+
+# --------------------------------------------------------------------------------------------------
+# Verification: recompute every digest from the record bytes, check classification + union semantics.
+# --------------------------------------------------------------------------------------------------
+def _all_evidence_refs(case: dict):
+    """Yield (evidenceRef, wire_key) for every evidenceRef in a case's wire annotation(s)."""
+    for key in ("wire", "wire_cbor", "wire_jcs"):
+        wire = case.get(key)
+        if not wire:
+            continue
+        for meta_holder in ("_meta", "result_meta", "content_block_meta"):
+            meta = wire.get(meta_holder)
+            if meta and EXT in meta and "evidenceRef" in meta[EXT]:
+                yield meta[EXT]["evidenceRef"], key
+
+
+def verify(cases: list[dict]) -> None:
+    checked = 0
+    for case in cases:
+        # 1. Every evidenceRef digest re-derives from the record bytes under its canonicalization.
+        for er, _ in _all_evidence_refs(case):
+            recomputed = compute_digest(case["record"], er["canonicalization"])
+            assert recomputed == er["digest"], (
+                f"{case['id']}: digest mismatch under {er['canonicalization']}: "
+                f"{recomputed} != {er['digest']}"
+            )
+            checked += 1
+
+        # 2. B3: the two envelopes must produce DIFFERENT digests (distinct re-derivable references).
+        if case["id"] == "B3_same_record_two_envelopes":
+            d_cbor = case["wire_cbor"]["_meta"][EXT]["evidenceRef"]["digest"]
+            d_jcs = case["wire_jcs"]["_meta"][EXT]["evidenceRef"]["digest"]
+            assert d_cbor != d_jcs, "B3: the two canonicalizations should differ"
+
+        # 3. Emitter classification rule (per-resource, fail-closed on unknown).
+        if case["family"] == "emitter_classification":
+            vis = case.get("resource_visibility")
+            if case["id"] == "A5_content_block_union_no_weaken":
+                result_claim = case["wire"]["result_meta"][EXT].get("sensitive")
+                block_claim = case["wire"]["content_block_meta"][EXT].get("sensitive")
+                effective = bool(result_claim) or bool(block_claim)  # union; once true stays true
+                assert effective is True and case["effective_sensitive"] is True
+                assert block_claim is True and not result_claim, "block must refine, not weaken"
+            else:
+                sensitive = case["wire"]["_meta"][EXT].get("sensitive")
+                if vis in ("private", "unknown"):
+                    assert sensitive is True, f"{case['id']}: {vis} resource must be sensitive=true"
+                elif vis == "public":
+                    assert sensitive is not True, f"{case['id']}: public resource must not claim sensitive"
+    print(f"verified {checked} evidenceRef digests across {len(cases)} cases", file=sys.stderr)
+
+
+def main() -> None:
+    ap = argparse.ArgumentParser()
+    ap.add_argument("--verify", action="store_true", help="verify committed examples.json only")
+    args = ap.parse_args()
+
+    self_test()  # encoders match the RFC vectors before any digest is trusted
+
+    if args.verify:
+        with open(EXAMPLES, encoding="utf-8") as fh:
+            doc = json.load(fh)
+        cases = doc["cases"]
+    else:
+        cases = build_cases()
+        doc = {
+            "extension": EXT,
+            "summary": "worked emitter/consumer cases: public-repo private-subresource classification "
+                       "+ policy-decision / sequence evidenceRef re-derivation, reproducible from bytes",
+            "canonicalizations": ["cbor/rfc8949", "jcs/rfc8785"],
+            "value_profile": "text strings, arrays, string-keyed maps, booleans, null, non-negative "
+                             "integers (no floats / RFC 8785 number edge cases)",
+            "cases": cases,
+        }
+        with open(EXAMPLES, "w", encoding="utf-8") as fh:
+            json.dump(doc, fh, indent=2, sort_keys=True)
+            fh.write("\n")
+
+    verify(cases)
+    print("PASS")
+
+
+if __name__ == "__main__":
+    main()