diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000..3258672 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,42 @@ +# Contributing + +This repository is an incubation space for the +[Tool Annotations Interest Group](https://modelcontextprotocol.io/community/tool-annotations/charter). +We welcome proposals, schema changes, and reference implementations that +inform a future Extensions Track SEP. + +## What lives here + +- **Specification drafts** — `specification/draft/.mdx`, + one file per extension, written in the same RFC-2119 style as the core MCP + specification (per [SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md)). +- **Decision records** — `docs/decisions.md`. Append, do not rewrite. +- **Open questions** — `docs/open-questions.md`. + +## What does *not* live here + +- Implementation code. Reference implementations live in their own + repositories and are linked from the relevant `specification/draft/*.mdx`. +- Binding specification changes. Those are made through the + [SEP process](https://modelcontextprotocol.io/community/sep-guidelines). + +## Proposing a change to an existing extension + +1. Open a PR against `specification/draft/.mdx`. +2. Update the **Status** and **Changelog** sections in the frontmatter. +3. If the change is breaking (per the SEP-2133 definition), use a new + extension identifier and a new file. +4. Append an entry to `docs/decisions.md` if the change reflects a design + decision worth preserving. + +## Proposing a new extension + +1. Read [SEP-2133, "Experimental Extensions"](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#experimental-extensions). +2. Open a discussion or PR proposing the identifier and scope. +3. On acceptance, add `specification/draft/.mdx` using the + frontmatter from an existing draft as a template. 
+ +## Code of conduct + +This repository follows the +[MCP Code of Conduct](https://github.com/modelcontextprotocol/.github/blob/main/CODE_OF_CONDUCT.md). diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..b024e54 --- /dev/null +++ b/LICENSE @@ -0,0 +1,11 @@ +Apache License +Version 2.0, January 2004 +http://www.apache.org/licenses/ + +Per SEP-2133, official MCP extensions are required to be available under the +Apache 2.0 license. Experimental extensions in this repository follow the +same convention so that contributions made here can flow into a future +official extension repository without re-licensing. + +The full text of the Apache License, Version 2.0 is available at: +https://www.apache.org/licenses/LICENSE-2.0 diff --git a/MAINTAINERS.md b/MAINTAINERS.md new file mode 100644 index 0000000..a72eb4f --- /dev/null +++ b/MAINTAINERS.md @@ -0,0 +1,20 @@ +# Maintainers + +This repository is governed by the +[Tool Annotations Interest Group](https://modelcontextprotocol.io/community/tool-annotations/charter). +Day-to-day repository maintenance follows the IG's facilitator structure. + +| Role | Name | Organization | GitHub | +| ----------- | -------------- | ------------ | ---------------------------------------------------- | +| Facilitator | Sam Morrow | GitHub | [@SamMorrowDrums](https://github.com/SamMorrowDrums) | +| Facilitator | Robert Reichel | OpenAI | [@rreichel3](https://github.com/rreichel3) | + +Per [SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#experimental-extensions), +core maintainers of the modelcontextprotocol organization retain oversight, +including the ability to archive or remove this repository. + +## Per-extension maintainers + +Individual extensions may nominate additional maintainers responsible for +their specification draft and reference implementations. List them in the +`specification/draft/.mdx` frontmatter. 
diff --git a/README.md b/README.md index dc87aeb..4e14298 100644 --- a/README.md +++ b/README.md @@ -1 +1,123 @@ -# experimental-ext-tool-annotations +# Tool Annotations Interest Group — Experimental Extensions + +> ⚠️ **Experimental** — This repository is an incubation space for the +> [Tool Annotations Interest Group](https://modelcontextprotocol.io/community/tool-annotations/charter). +> Contents are exploratory drafts intended to feed future Extensions Track SEPs +> ([SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2133)). +> They do not represent official MCP specifications or recommendations. + +**Charter:** [modelcontextprotocol.io/community/tool-annotations/charter](https://modelcontextprotocol.io/community/tool-annotations/charter) +**Discord:** [#tool-annotations-ig](https://discord.com/channels/1358869848138059966/1482836798517543073) +**Open work:** [Pull requests](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pulls) + +## Why split the work? + +[SEP-1913 (Trust and Sensitivity Annotations)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) +bundles four concerns that have proven hard to evaluate as a single unit: a +client-facing trust taxonomy, action-security metadata for tool I/O, a +malicious-activity signal, and propagation rules across session boundaries. + +The sponsor, [@localden](https://github.com/localden), asked the central +question directly in review: the SEP "adds a few schema modifications and a +thorny array-or-scalar polymorphism on enum fields. If the taxonomy turns out +to be wrong, I worry that we can't remove it or easily modify it. Can we do a +potential narrower first cut?" The subsequent design discussion converged on a +layered answer: a small, stable annotation surface on the wire, with richer +evidence kept out-of-band and referenced by a bounded pointer. + +This repo follows that steer. 
The schema-bearing concerns become **separate +experimental extensions**, each with its own [reverse-DNS identifier](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#definition), +reference implementation, and path to a future Extensions Track SEP, so drafts +can graduate independently. The concrete data-labelling models that fill an +extension's evidence slot are kept separate again — as interchangeable **schemes** +rather than extensions — so no single academic model is baked into the wire. This +directly addresses the "narrower first cut" ask without throwing away the +combinatoric value of the full set. + +See [docs/decisions.md](docs/decisions.md) for the decision record and +[docs/trust-model.md](docs/trust-model.md) for the shared enforcement model. + +## Extensions + +| Identifier | Status | What it specifies | Reference implementation(s) | +| :--- | :--- | :--- | :--- | +| [`io.modelcontextprotocol/trust-annotations`](specification/draft/trust-annotations.mdx) | Draft skeleton | **Primary extension.** A small, scheme-agnostic client-facing data-classification vocabulary (`sensitive`, `untrusted`) on result `_meta`, plus an optional `evidenceRef` pointer slot that carries richer payloads out-of-band. | Python SDK: [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) (138-test suite, healthcare demo, LLM usability study). | +| [`io.modelcontextprotocol/action-metadata`](specification/draft/action-metadata.mdx) | Draft skeleton | `inputMetadata` / `returnMetadata` / outcome classifiers (incl. `requires_review`) on `ToolAnnotations`, describing where inputs go, where outputs originate, and what real-world effects a tool can cause. 
| Originally [SEP-2061 (Action Security Metadata)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) by [@rreichel3](https://github.com/rreichel3) — closed 2026-06-13 in favour of this extension; worked example `read_drafts` / `list_inbox` / `send_email`. | + +Each extension is proposed in its own pull request so it can be reviewed and +graduate on its own clock. + +## Data-labelling schemes (the `evidenceRef` slot) + +The extensions above keep the wire vocabulary deliberately small. Richer +labelling lives **out-of-band**, referenced by the `trust-annotations` +[`evidenceRef`](specification/draft/trust-annotations.mdx) pointer, whose `type` +is an open string. A **scheme** is a concrete data-labelling or tool-annotation +approach that fills that slot under a `type` value. A scheme is **not** an +extension and not a sibling of the two above — it is one interchangeable way to +populate the evidence an extension carries, and a deployment can adopt, swap, or +ignore it without touching the extension. + +The [`schemes/`](schemes/) folder collects these approaches. 
**FIDES** is the +first worked example, defining `ifc.fides.v1`; it is one model among several that +reviewers and the literature have raised, and the slot is designed so any of them +can occupy it: + +| Scheme | `evidenceRef.type` | Source | +| :--- | :--- | :--- | +| FIDES information-flow control (integrity × confidentiality lattice) | `ifc.fides.v1` | [arXiv:2505.23643](https://arxiv.org/abs/2505.23643); emitter candidate [`github-mcp-server`](https://github.com/github/github-mcp-server) | +| Coarse data classification (4-level + regulatory scope) | `data-class.v1` | SEP-1913 taxonomy | +| Design-pattern controls (Plan-Then-Execute, Dual LLM, Map-Reduce) | _candidate_ | [arXiv:2506.08837](https://arxiv.org/abs/2506.08837) | +| Capability-token constraints (SINT) | _candidate_ | pshkv, [SEP-1913 thread](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) | +| Caller/tool cosigning | _candidate_ | viftode4, SEP-1913 thread | +| Sequence-shape audit records | _candidate_ | marras0914, SEP-1913 thread | +| Tool-call attestation (in-toto / OVERT envelopes) | _candidate_ | [SEP-2787](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) | + +Modelling IFC as a scheme rather than a namespace root is deliberate: a top-level +`ifc/` extension would bake one academic model into the wire and foreclose the +others. As one reviewer put it, IFC "fits relatively well if you use annotations" +— an endorsement of IFC *behind* the annotation slot, not as the slot itself. See +[`schemes/README.md`](schemes/README.md) for the full list and the bar for adding +a scheme. + +## Relationship to SEP-1913 + +SEP-1913 remains the canonical place to discuss the overall problem framing. +This repository develops the schema-bearing parts of that proposal as +independently shippable extensions. 
When an extension here is ready to graduate, +an Extensions Track SEP can reference this repo as the prior art and the working +implementation that SEP-2133 [requires](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#creation). + +For the full per-SEP plan — what happens to SEP-1913, SEP-2061, SEP-1862 and +others, and the SEP-2127 refactor precedent — see +[docs/sep-disposition.md](docs/sep-disposition.md). + +**Out of scope for these extensions** (see [docs/open-questions.md](docs/open-questions.md)): + +- **`maliciousActivityHint`** — reviewer concerns are structural (it fires at + `tools/resolve` before execution can produce evidence; a boolean is the wrong + granularity for client UX; clients won't trust server self-attestation). If it + returns, it is per-`ContentBlock` with spans, on a different clock. It stays + on the SEP-1913 umbrella rather than in an extension here. +- **Propagation rules** — sensitivity escalation across session boundaries, and + the sequence-shape gap (an annotation surface for "this was call N in a + flagged sequence") remain open. Likely a future extension once the taxonomy + and `evidenceRef` shape are stable. + +## Repository layout + +This repo mirrors the structure of official extension repositories such as +[`ext-auth`](https://github.com/modelcontextprotocol/ext-auth): + +``` +specification/draft/.mdx # one spec per extension (trust-annotations, action-metadata) +schemes/ # data-labelling schemes that fill the evidenceRef slot (FIDES, …) +docs/ # decision log, open questions, related work +MAINTAINERS.md # IG facilitators +``` + +## Contributing + +See [CONTRIBUTING.md](CONTRIBUTING.md). Substantive design discussion happens +on PRs against the relevant `specification/draft/*.mdx` file, in the IG +Discord, and (for cross-extension concerns) on [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913). 
diff --git a/docs/decisions.md b/docs/decisions.md new file mode 100644 index 0000000..e3cdd5a --- /dev/null +++ b/docs/decisions.md @@ -0,0 +1,166 @@ +# Decision log + +Append-only record of design decisions for the Tool Annotations IG's trust / +privacy extension work. Newest at the bottom. + +## 2026-06-10 — Split SEP-1913 into independent extensions + +**Decision.** Split the schema-bearing parts of SEP-1913 into separate +experimental extensions, each with its own `io.modelcontextprotocol/…` +identifier and reference implementation, rather than pursuing one broad +Standards Track SEP. + +**Rationale.** @localden's review asked for a *narrower first cut*; the IG +[aligned 2026-05-28](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820) +on an extension-first strategy. Independent extensions can graduate on their own +clock and avoid hard-to-remove schema. + +## 2026-06-10 — Three initial extensions + +**Decision.** `trust-annotations` (primary), `action-metadata`, `ifc-fides`. + +**Rationale.** These are the three pieces with either a reference implementation +or an existing SEP behind them: Kapil's SDK, SEP-2061 (Reichel), and the FIDES +model respectively. + +## 2026-06-10 — FIDES is a profile, not a top-level extension + +**Decision.** Information-flow control is `type: "ifc.fides.v1"`, a profile of +the `trust-annotations` `evidenceRef` slot — not a top-level `io.modelcontextprotocol/ifc` +extension. + +**Rationale.** IFC is one enforcement model among several raised in review +(capability tokens — pshkv; cosigning — viftode4; sequence shape — marras0914). +A top-level `ifc/` root would foreclose those. Reviewer endorsement was for IFC +"if you use annotations" — i.e. as a profile. + +## 2026-06-10 — `evidenceRef.type` is an open string + +**Decision.** `type` MUST remain an open string with a non-binding registry of +well-known values; never a closed enum. 
Required fields are `digest` and +`canonicalization`; `schema` recommended; `ref` optional. + +**Rationale.** Adapted from the vaaraio / Rul1an convergence in the SEP-1913 +thread. An open `type` is what lets IFC, data-class, sequence-shape, and +attestation profiles share one slot. + +## 2026-06-10 — `requiresReview` moves to `action-metadata` + +**Decision.** `requiresReview` is an `action-metadata` field, not a +`trust-annotations` field. + +**Rationale.** It is a workflow/consent signal, not a data-classification +property. Keeping it out of the trust taxonomy avoids reproducing SEP-1913's +"several concerns in one schema" problem at smaller scale. + +## 2026-06-10 — DataClass demoted to a profile + +**Decision.** The wire taxonomy keeps only the coarse `sensitive` boolean. +The four-level classification + regulatory scope becomes an `evidenceRef` +profile `type: "data-class.v1"`. + +**Rationale.** Coarse binary is universally client-actionable and cheap on the +wire; the richer taxonomy can evolve behind a profile without a breaking schema +change. + +## 2026-06-10 — Parked: maliciousActivityHint, propagation rules + +**Decision.** Neither becomes an extension now; both stay on the SEP-1913 +umbrella. + +**Rationale.** `maliciousActivityHint` has unresolved structural objections +(fires pre-execution at `tools/resolve`; boolean granularity wrong for UX; +clients won't trust server self-attestation). Propagation/sequence-shape needs +the taxonomy and `evidenceRef` stable first. + +## 2026-06-10 — Citations: public sources only + +**Decision.** Reference implementations and motivating examples cite **public** +artifacts — [`github-mcp-server`](https://github.com/github/github-mcp-server), +[`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations), +[arXiv:2505.23643](https://arxiv.org/abs/2505.23643) — and index on the public +SEP-1913 review record (esp. @localden). Private/internal implementations are +not named or linked. 
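As a rough illustration of the `evidenceRef` shape recorded earlier in this log (open-string `type`; required `digest` and `canonicalization`; recommended `schema`; optional `ref`), the Python sketch below builds and checks such a pointer. The `sha256:` digest prefix, the `json-sorted-keys` canonicalization label, the helper names, and the treatment of `type` as always present are assumptions made for illustration; none of them come from the draft.

```python
import hashlib
import json
from typing import List, Optional

# Per the decision log: `digest` and `canonicalization` are required.
REQUIRED_FIELDS = ("digest", "canonicalization")


def canonical_json(payload: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace (an assumed canonical form)."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_evidence_ref(scheme_type: str, payload: dict, ref: Optional[str] = None) -> dict:
    """Build a pointer to an out-of-band evidence payload (hypothetical helper)."""
    evidence_ref = {
        "type": scheme_type,  # open string: any scheme identifier is accepted
        "digest": "sha256:" + hashlib.sha256(canonical_json(payload)).hexdigest(),
        "canonicalization": "json-sorted-keys",  # assumed label, not from the draft
    }
    if ref is not None:
        evidence_ref["ref"] = ref  # optional locator for the payload
    return evidence_ref


def validate_evidence_ref(evidence_ref: dict) -> List[str]:
    """Return a list of problems; an empty list means the pointer looks well-formed."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in evidence_ref
    ]
    scheme_type = evidence_ref.get("type")
    # Assumption: `type` is always present and is a non-empty string, never a closed enum.
    if not isinstance(scheme_type, str) or not scheme_type:
        problems.append("type must be a non-empty string")
    return problems
```

Because the validator never compares `type` against a fixed list, a value such as `ifc.fides.v1`, `data-class.v1`, or a vendor-specific string passes the same check, which is the property the open-string decision is meant to preserve.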
+ +## 2026-06-10 — Pre-flight (SEP-1862) stays core + +**Decision.** These extensions are response-level and do not depend on Tool +Resolution. SEP-1862 remains a core/Standards-Track protocol change. + +**Rationale.** The 2026-05-28 IG meeting concluded pre-flight is inherently a +protocol-level change, not an extension. + +## 2026-06-16 — FIDES is a scheme, not a sibling extension + +**Decision.** Refines the 2026-06-10 "FIDES is a profile" decision. The IFC/FIDES +work moves out of `specification/draft/` (where it sat next to the two +extensions) into a `schemes/` folder. There are **two** extensions +(`trust-annotations`, `action-metadata`); FIDES is **one data-labelling scheme** +(`ifc.fides.v1`) that fills the `trust-annotations` `evidenceRef` slot. + +**Rationale.** FIDES is one model the extension *could* use, not a peer of the +extensions, and must not be presented as a sibling. The original SEP cites it +alongside ShardGuard and "Design Patterns for Securing LLM Agents," and the +SEP-1913 thread adds capability tokens, cosigning, sequence-shape, and +attestation models — so `schemes/` is a folder for interchangeable approaches, +with FIDES as the first worked one. This shows the range the open `evidenceRef` +slot is meant to carry rather than implying IFC is the privileged model. + +## 2026-06-16 — Three pull requests, stacked + +**Decision.** The work ships as three PRs: `trust-annotations` (the base, +carrying shared repo scaffolding), `action-metadata` (stacked on the base), and +the FIDES scheme in `schemes/` (stacked on the base). The two extensions are +independent; the FIDES scheme depends on `trust-annotations` because it fills +that extension's `evidenceRef` slot. + +**Rationale.** Separate PRs let each piece be reviewed and graduate on its own +clock. FIDES stacks on `trust-annotations` because a scheme has no meaning +without the slot it fills. 
+ +## 2026-06-16 — Schemes carry data labels; host architectures do not + +**Decision.** `schemes/` holds **data-labelling** approaches a server attaches to +a result (FIDES, Permissive IFC, AirGapAgent, `data-class`, attestation +envelopes). **Host architectures** — control-flow designs the client/host runs +(CaMeL, the "Design Patterns for Securing LLM Agents" catalogue, Dual-LLM) — are +prior art in [`related-work.md`](./related-work.md), not candidate schemes. + +**Rationale.** A scheme produces a label; an architecture decides what to do with +one. Conflating them would invite a `schemes/camel.md` that has no per-result +payload to define. A capability token such an architecture issues can still be +*referenced* through `evidenceRef`, but the architecture itself is not a scheme. + +## 2026-06-16 — Early SEP-1913 feedback recorded as cited open questions + +**Decision.** The substantive concerns from the original issue (#711) and SEP +(#1913) review — the set-theoretic critique of linear sensitivity, org-defined +vocabularies, the class+regulatory pairing, taint persistence across storage, +per-block byte ranges, sequence-shape, and the false-security risk — are +captured with reviewer attributions in [`open-questions.md`](./open-questions.md) +rather than silently dropped by the narrower cut. + +**Rationale.** The narrow first cut (`sensitive: boolean`) deliberately omits a +lot of debated design. Recording *why*, with links to the people who raised each +point, keeps the history visible and gives each parked item a home to graduate +from (a scheme, an `action-metadata` field, or a future extension) instead of +being re-litigated from scratch. + +## 2026-06-16 — `sensitive` is a lowest-common-denominator floor; emit both + +**Decision.** The coarse `sensitive` boolean is intentionally a +lowest-common-denominator signal — a universal, always-actionable floor that +supports a basic "better than nothing" egress/consent policy even against a +barely-known server. 
It is the basic, general scheme every participant +understands, **not** a competitor to richer schemes. Servers SHOULD emit **both** +the boolean and a richer `evidenceRef` scheme (e.g. `data-class.v1`, +`ifc.fides.v1`) where they can; `sensitive` MUST NOT be dropped merely because a +scheme is present. + +**Rationale.** Clarifies the original purpose of the boolean (raised by Sam): the +point of keeping it on the wire was never to *replace* richer classification but +to guarantee a floor any client can act on. Richer schemes are strictly more +capable but are not universally implemented, so they cannot be the floor — +layering the two gives universal actionability without capping what advanced +hosts can do. This also answers the "boolean vs. richer taxonomy" tension from +SEP-1913 review: it is not either/or, it is both, at different layers. diff --git a/docs/intent-comment.md b/docs/intent-comment.md new file mode 100644 index 0000000..f3e4946 --- /dev/null +++ b/docs/intent-comment.md @@ -0,0 +1,74 @@ +# Intent comments (posted) + +Both comments below have been **posted**. Kept here as the source of record, +in sync with [sep-disposition.md](./sep-disposition.md). + +- **SEP-1913** umbrella comment — [posted 2026-06-10](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4675047154). +- **SEP-2061** coordination note — [posted 2026-06-10](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061#issuecomment-4675049171); + @localden then **closed SEP-2061 on 2026-06-13** in favour of the + `action-metadata` extension. + +--- + +## For SEP-1913 + +> **Intent: split this SEP and migrate to the Extensions Track** +> > +> A note on direction for everyone following this thread. When SEP-1913 was +> first framed, the **Extensions Track** (SEP-2133) and the `experimental-ext-*` +> incubation process didn't exist in their current form. They now do, and +> they're a better fit for this work than a single Standards Track SEP. 
+> > +> Two things pushed us here: +> > +> - @localden's review ask for a **narrower first cut** — the concern that a +> broad taxonomy with array-or-scalar polymorphism is hard to remove or change +> once it lands. +> - The Tool Annotations IG's +> [May 28 decision](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820) +> to pursue trust/privacy as an **experimental extension first**, gather +> adoption evidence, then ask core maintainers to absorb anything. +> > +> So the plan is to \*\*split this proposal into a few small, +> independently-shippable extensions\*\*, each with its own +> `io.modelcontextprotocol/…` identifier, reference implementation, and path to +> an Extensions Track SEP. Incubation is in +> [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations). +> > +> It's now scaffolded as \*\*two extensions plus a `schemes/` folder\*\* of +> interchangeable data-labelling approaches, shipped as a stacked set of PRs: +> > +> - [#2](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/2) — repo scaffolding + the `trust-annotations` extension (the base). +> - [#3](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/3) — the `action-metadata` extension. +> - [#4](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/4) — FIDES as a data-labelling \*\*scheme\*\* under `schemes/`. +> > +> | Extension | Scope | +> |---|---| +> | `trust-annotations` | The narrow data-classification taxonomy (`sensitive`, `untrusted`) + an open-ended `evidenceRef` pointer for richer, out-of-band evidence. | +> | `action-metadata` | Tool I/O + outcome contract (folds in @rreichel3's SEP-2061). | +> > +> \*\*Data-labelling schemes (the `evidenceRef` slot).\*\* Richer evidence models +> are \*not\* extensions and \*not\* a wire root. 
They live in `schemes/` as +> interchangeable fillers of the `trust-annotations` `evidenceRef` slot, each +> selected by an `evidenceRef.type` value, so a deployment can adopt one, several, +> or none without changing the extension. FIDES information-flow control +> ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) is the first worked scheme +> (`ifc.fides.v1`) — the public/private-repo confidentiality case, with +> github-mcp-server as an emitter example. The folder is built to hold the range of +> other models raised in review (coarse data classification, design-pattern +> controls, capability tokens, cosigning, sequence-shape, attestation). +> > +> Deliberately removed: `maliciousActivityHint` (the structural concerns raised +> here are unresolved) and session-level propagation rules. +> > +> This follows the same Standards-Track → Extensions-Track refactor pattern as +> SEP-2127 (#2893). This PR is now the `trust-annotations` base of the stack; the +> `action-metadata` extension and the `schemes/` folder are stacked on it. +> Everything is still in the incubation phase, so naming, design, and the choice of +> what to put forward as an extension are all open for discussion in the IG. + +--- + +## For SEP-2061 (coordination note) + +> @rreichel3 — splitting this out into an independent extension as discussed: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4675047154 \ No newline at end of file diff --git a/docs/open-questions.md b/docs/open-questions.md new file mode 100644 index 0000000..86898e7 --- /dev/null +++ b/docs/open-questions.md @@ -0,0 +1,101 @@ +# Open questions + +Tracked here rather than in the spec drafts, so the drafts stay non-temporal. + +## Cross-cutting + +- **Where does the policy-enforcement engine live** across different user + universes (cross-org, cross-domain)? Engines work well within one universe; + cross-domain is the hard case. (IG 2026-05-28.) 
+- **Cross-domain integrity verification** — is asymmetric crypto for domain + identity in scope for a future extension, or out of scope entirely? CLI tools + remain a persistent gap for enforcing these constraints. +- **`evidenceRef.type` registry** — who curates the list of well-known scheme + types, and how do we coordinate with attestation SEPs (e.g. SEP-2787) so + values don't collide? + +## trust-annotations + +- **Sensitivity beyond the floor.** The `sensitive` boolean is settled as the + **lowest-common-denominator floor** — a basic, general signal every client can + act on, "better than nothing," with servers encouraged to *also* emit a richer + scheme (see the [decision log](./decisions.md) and the "emit both" guidance in + the spec). The residual open question is narrower: is "coarse boolean + richer + `evidenceRef` scheme" sufficient for regulated flows, or do some hosts need the + classification expressible *on the wire* without a scheme? Current lean: **no + wire escape hatch** — keep the wire floor un-rottable and push the taxonomy + into [`data-class.v1`](../schemes/data-class.md). Background on why a single + scalar/enum was rejected: sensitivity is set-theoretic not linear + ([@JustinCappos](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2967516811)), + reviewers wanted org-defined vocabularies + ([@olaservo](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2968743154), + [@Mossaka](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2971788308)) + and a class+regulatory pairing + ([@krubenok](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#discussion_r3103485194)), + and a baked-in taxonomy is hard to remove + ([@localden](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4037623595)). +- **Enforcement vs. 
advisory.** A self-declared `sensitive: true` from a + poorly-configured or malicious server could create a false sense of security + ([@localden](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4037623595)). + [docs/trust-model.md](./trust-model.md) puts verification in registries; is + that enough without a normative client-side check path? +- Content-block-level vs. result-level attachment — does the draft need a + worked multi-result example before it's implementable? Per-block annotation + with **byte/codepoint ranges** was requested so clients can highlight the + flagged span, not just the whole result + ([@connor4312](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-3849207989)). +- `list_changed`: confirmed response-level annotations don't participate; revisit + only if trust vocabulary ever attaches to tool definitions. + +## Naming (under review) + +These are explicitly unsettled and are being reviewed against the SEP-1913 +record before any rename lands: + +- **Umbrella name.** The original SEP was "Trust *& Sensitivity* Annotations" — + two axes (integrity *and* confidentiality). `trust-annotations` reads as the + integrity half; does the name hide the `sensitive` (confidentiality) half? +- **`untrusted` vs `openWorldHint`.** Same concept, different layer (result vs. + tool definition). Share a name/vocabulary, or keep them deliberately distinct? +- **`evidenceRef.type` vs `evidenceRef.scheme`.** The repo calls these values + "schemes" (`schemes/`); the selector field is `type`. Align the field name? +- **`evidenceRef` itself.** Its ancestor is [@pshkv](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4196867926)'s + `decision_ref` / `attestation_ref` / `policy_profile`. Is `evidenceRef` the + clearest umbrella, or should the pointer name the kind of thing it references? + +## action-metadata + +- Coexistence vs. 
replacement of legacy `destructiveHint` / `readOnlyHint` / + `idempotentHint` / `openWorldHint`. +- Open strings vs. closed enums for `destination` / `source` / `sensitivity`. +- Does `requiresReview` need a machine-readable *reason* for good client UX? + +## ifc-fides (scheme) + +- Inline `_meta.ifc` for low-friction adoption vs. always behind `evidenceRef`. +- GitHub Enterprise `internal` repo visibility → `public`/`private` mapping + (audience is the whole org, broader than collaborators; resolved host-side). +- Reader-set resolution is host-side by design — confidentiality join across two + `private` sources needs the intersection, which the opaque wire marker can't + express. Is the 3-step host resolution enough, or do some hosts need a + standard `evidenceRef.ref` shape to locate the originating system? + +## Parked (SEP-1913 umbrella) + +- **`maliciousActivityHint`** — killed in review on three grounds + ([@connor4312](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-3849207989)): + it can't be resolved before execution for dynamic tools; a boolean gives no + UX or ranges; and clients won't trust a server's self-report of its own + maliciousness. If it returns, it is per-`ContentBlock` with spans, driven by + the **host's** own detection, not a server-attested boolean. +- **Taint persistence across the store boundary** — a label must survive + round-tripping through storage: write a card number to a file, read it back, + and the sensitivity label must not silently disappear + ([@JustinCappos](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2967516811)). + This is a propagation property no single response annotation can guarantee on + its own; it needs the taxonomy and `evidenceRef` stable first. 
+- **Session-level propagation rules** — escalation semantics and the + **sequence-shape** gap: "this was call N in a flagged sequence" has no response + annotation surface today ([marras0914](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913)). + A candidate `evidenceRef` scheme could carry a sequence assertion; tracked in + [`schemes/README.md`](../schemes/README.md). diff --git a/docs/related-work.md b/docs/related-work.md new file mode 100644 index 0000000..858391f --- /dev/null +++ b/docs/related-work.md @@ -0,0 +1,50 @@ +# Related work + +External references and prior art relevant to the IG's trust / privacy +annotation work. Several were surfaced in IG meetings (notably 2026-05-28). + +## SEPs + +- [SEP-1913 — Trust and Sensitivity Annotations](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) — the umbrella proposal these extensions derive from. +- [SEP-2061 — Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) — closed 2026-06-13; carried forward as `action-metadata`. +- [SEP-1862 — Tool Resolution / pre-flight checks](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862) — core-protocol, composes with these extensions. +- [SEP-2133 — Extensions](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md) — the framework this repo incubates under. +- [SEP-2127 — Server Cards](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893) — precedent for the Standards→Extensions Track refactor. +- [SEP-2787 — Tool Call Attestation](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) — candidate `evidenceRef` scheme. + +## Research + +- **FIDES** — *Information-flow control for LLM agents.* [arXiv:2505.23643](https://arxiv.org/abs/2505.23643). Basis for the `ifc.fides.v1` scheme in [`schemes/`](../schemes/). 
+- **Permissive Information-Flow Analysis for LLMs** — relaxes IFC join so a label propagates only when an input actually influences an output. [arXiv:2410.03055](https://arxiv.org/abs/2410.03055). Candidate `evidenceRef` scheme (per-result label), like FIDES. +- **AirGapAgent** — contextual-integrity minimisation: restrict per-task data to what the context warrants. [arXiv:2405.05175](https://arxiv.org/abs/2405.05175). Candidate scheme: emits a contextual-integrity classification per result. +- **CaMeL — Defeating Prompt Injections by Design** — capability-based control/data-flow extraction. [arXiv:2503.18813](https://arxiv.org/abs/2503.18813). A **host architecture**, not a data-label scheme (see note below); a capability token it issues could be referenced via `evidenceRef`, but the architecture itself is not a scheme. +- **Design Patterns for Securing LLM Agents** — IBM/Google/Microsoft. [arXiv:2506.08837](https://arxiv.org/abs/2506.08837). Plan-Then-Execute, Dual LLM, Map-Reduce, etc. Also **host architectures**, not schemes. +- **Trail of Bits** — prompt-injection via hidden content in GitHub issues. [blog](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/). +- **OpenAI Auto Review** — https://alignment.openai.com/auto-review/ (shared in IG chat). + +### Schemes vs. host architectures + +The `evidenceRef` slot carries **data labels** — a per-result record a server +can attach (FIDES, Permissive IFC, AirGapAgent, data-class, attestation +envelopes). It does **not** carry **host architectures** — control-flow designs +the *client/host* runs (CaMeL, the Design-Patterns catalogue, Dual-LLM). These +are complementary: an architecture decides what to do with a label; the label is +what a scheme produces. Only data-label schemes belong in [`schemes/`](../schemes/). 
+ +## Implementations & tooling + +- [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) — reference Python SDK PoC for `trust-annotations`. +- [`github-mcp-server`](https://github.com/github/github-mcp-server) — public MCP server; emitter candidate for the `ifc-fides` scheme (knows repo visibility + collaborators). +- **Ethyca** data-labeling docs — https://www.ethyca.com/docs (shared in IG chat). +- **GitHub Next** agentic-workflows research on data labeling — to be documented as issues in this repo (IG action item, @gokhanarkan / @joannakl). + +## Adjacent community proposals (from the SEP-1913 thread) + +- **SINT Protocol** (capability-token constraint enforcement) — pshkv. +- **in-toto** attestations as a trust-annotation substrate. +- **OVERT 1.0** envelope shape for runtime evidence. +- Caller/tool **cosigning** model — viftode4. +- **Sequence-shape** policies — marras0914. + +These are exactly the models that `evidenceRef`'s open `type` is designed to +accommodate as schemes — see [`schemes/`](../schemes/). diff --git a/docs/sep-disposition.md b/docs/sep-disposition.md new file mode 100644 index 0000000..7a1fe83 --- /dev/null +++ b/docs/sep-disposition.md @@ -0,0 +1,114 @@ +# SEP disposition: what happens to the existing proposals + +This document explains how the existing trust/privacy/annotation SEPs map onto +the experimental extensions incubated in this repository, and what is proposed +to happen to each SEP. It exists so that anyone arriving from one of those PRs +can understand the plan without reading the whole thread. + +> **Status:** proposal / options. Nothing here is decided until reflected in the +> relevant SEP PRs. The migration follows the precedent set by +> [SEP-2127 → Extensions Track (#2893)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893). 
+ +## Why anything changes + +When [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) +was first framed, the **Extensions Track** ([SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md)) +and the `experimental-ext-*` incubation process did not exist in their current +form. Two inputs since then point at a different shape: + +1. **Sponsor steer.** [@localden](https://github.com/localden) asked for a + *narrower first cut* — a single broad taxonomy with array-or-scalar enum + polymorphism is hard to remove or change once shipped. +2. **IG decision.** The Tool Annotations IG + [aligned on 2026-05-28](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820) + to pursue this work as an **experimental extension first**, build an + adoption/evidence base, and only then ask core maintainers to absorb + anything into the protocol. + +The result: split the schema-bearing parts into small, independent extensions, +each able to graduate on its own clock. + +## The precedent: SEP-2127 (Server Cards) + +[#2893](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893) +refactored SEP-2127 from **Standards Track** to **Extensions Track**: + +- Frontmatter `Type: Standards Track` → `Type: Extensions Track`, plus an + `Extension Identifier` and "on behalf of the WG" attribution. +- A top-of-file `` pointing at the experimental repo as the spec home. +- The SEP body **slimmed to a charter** — Abstract, Motivation, Rationale, a + high-level Specification *pointer*, security posture *summary* — with the + detailed normative wire format delegated to the extension repo. +- The SEP body kept **non-temporal**: a published SEP is frozen, so in-flight + "open items" live in the PR description and extension-repo issues, not in the + SEP text. + +We apply the same playbook below. 
+ +## Per-SEP disposition + +### SEP-1913 — Trust and Sensitivity Annotations + +**Proposed:** becomes the **umbrella / problem-framing** thread. The schema- +bearing content moves into the extensions below. Options, in order of +preference: + +- **(A, preferred)** Keep the PR open as the framing umbrella; add an intent + comment (see [below](#intent-comment)); later, either refactor it to an + Extensions Track *charter* that points here (SEP-2127 shape) **or** close it + in favor of per-extension Extensions Track SEPs once those are ready. +- **(B)** Refactor 1913 itself into the `trust-annotations` Extensions Track + SEP and spin the others off as siblings. +- **(C)** Close 1913 outright and open three fresh Extensions Track SEPs. Loses + the discussion history's continuity; not preferred. + +**Moved into extensions:** `trust-annotations`, `action-metadata`. +**Moved into `schemes/`:** the IFC/FIDES work, as one data-labelling **scheme** +(`ifc.fides.v1`) that fills the `trust-annotations` `evidenceRef` slot — not an +extension and not a sibling of the two above. +**Parked on the umbrella:** `maliciousActivityHint`, +session-level propagation rules. See [open-questions.md](./open-questions.md). + +### SEP-2061 — Action Security Metadata + +**Disposition:** **closed 2026-06-13** in favour of the +[`action-metadata`](../specification/draft/action-metadata.mdx) extension. +SEP-2061 is by [@rreichel3](https://github.com/rreichel3), who is also an IG +co-facilitator and SEP-1913 co-author, so this was a fold-in, not a collision. +[@localden](https://github.com/localden) closed the PR (no active sponsor) after +agreeing the extension is the right home; the extension now carries the field +semantics forward, with SEP-2061 preserved as the origin and credit. 
+ +### SEP-1862 — Tool Resolution (pre-flight checks) + +**Proposed:** **stays Standards Track / core.** The 2026-05-28 IG meeting +concluded pre-flight checks are inherently a protocol-level change, not an +extension. These extensions are deliberately **response-level** (`_meta` on +results, static `ToolAnnotations`) and do **not** depend on 1862. They compose +with it if it lands, but do not block on it. + +### Other related SEPs (not owned here) + +- **SEP-1984 (Comprehensive Tool Annotations)**, **SEP-2417 (Model Preferences + for Tools)** — tracked by the IG as discussion items; not part of these + extensions. Cross-link only. +- **SEP-2787 (Tool Call Attestation)** and the various attestation/evidence + threads — these are natural `evidenceRef` **scheme** candidates rather than + competitors. Coordinate so the `evidenceRef.type` registry can list them. + +## Mapping table + +| SEP | Title | Proposed disposition | Extension home | +| :-- | :-- | :-- | :-- | +| 1913 | Trust & Sensitivity Annotations | Umbrella thread; schema moves to extensions | `trust-annotations` (+ `schemes/ifc-fides`) | +| 2061 | Action Security Metadata | **Closed 2026-06-13**; lives as extension | `action-metadata` | +| 1862 | Tool Resolution (pre-flight) | Stays core / Standards Track | — (composes, no dependency) | +| 1984 | Comprehensive Tool Annotations | IG discussion item | — | +| 2417 | Model Preferences for Tools | IG discussion item | — | +| 2787 | Tool Call Attestation | Candidate `evidenceRef` scheme | (future) | + +## Intent comment + +The text we plan to post on SEP-1913 (and, abbreviated, on SEP-2061) lives in +[intent-comment.md](./intent-comment.md) so it can be reviewed before posting +and kept in sync with this document. 
diff --git a/docs/trust-model.md b/docs/trust-model.md new file mode 100644 index 0000000..8d85156 --- /dev/null +++ b/docs/trust-model.md @@ -0,0 +1,53 @@ +# Trust model + +A single statement of the enforcement model shared across this repository's +extensions and data-labelling schemes, so individual specs don't re-litigate it. + +## Annotations are claims, not guarantees + +An annotation on a tool result or definition is a **claim** made by whoever +produced it. Nothing in these extensions assumes the producer is honest or +competent. The value of an annotation comes from two things: + +1. **Verifiability.** Where a claim needs to be trusted, the `evidenceRef` + pointer (see [`trust-annotations`](../specification/draft/trust-annotations.mdx)) + lets a consumer resolve and check the evidence behind it — re-hash the + referenced record, verify a signature, check an inclusion proof — rather + than taking the claim on faith. +2. **Accountability.** Enforcement lives with the parties that can impose + consequences: **registries and marketplaces** that admit servers, + **hosts/clients** that gate actions, and **operators** that set policy. The + annotation gives those parties a machine-readable surface to act on. + +This mirrors the framing repeated throughout the SEP-1913 discussion: trust +comes from the ecosystem verifying annotations, not from developer good faith. +A server that lies in its annotations is a server a registry can refuse to list +and a host can refuse to trust — the same accountability model as any other +declared capability. + +## Defense in depth, not a single gate + +As framed in the IG's inaugural meeting, these annotations are **defense in +depth**: they reduce the likelihood of unintended actions (unnecessary +destructive operations, data crossing a boundary it shouldn't); they don't +claim to be a complete security boundary.
A host SHOULD combine them with its +own checks (its own injection detection, its own policy engine) rather than +treating any single annotation as authoritative. + +## Human-in-the-loop on the risky edges + +A recurring pattern from the IG research (Joanna's "phases" work, and the +SEP-1913 thread): rather than blanket-blocking flows a policy engine is unsure +about, **flag the specific call for user confirmation**. This preserves utility +while keeping a human on the genuinely risky edges, and is the recommended +default for `requiresReview` ([`action-metadata`](../specification/draft/action-metadata.mdx)) +and for IFC policy violations (the [`ifc-fides`](../schemes/ifc-fides.md) scheme). + +## Cross-domain is the hard case + +Policy engines work well inside a single user/organization universe and +struggle across universes (cross-org, cross-domain flows). These extensions +provide the *signal* (where data came from, how sensitive it is, what a tool +does with it); they do not solve cross-domain enforcement on their own. That +remains an [open question](./open-questions.md) and a likely area for future +work (e.g. asymmetric crypto for domain integrity verification). diff --git a/specification/draft/trust-annotations.mdx b/specification/draft/trust-annotations.mdx new file mode 100644 index 0000000..6fe939e --- /dev/null +++ b/specification/draft/trust-annotations.mdx @@ -0,0 +1,220 @@ +--- +title: Trust Annotations +--- + +**Protocol Revision**: draft + +**Extension identifier:** `io.modelcontextprotocol/trust-annotations` + +> ⚠️ **Experimental draft skeleton.** This document captures the agreed shape +> and the open questions. Normative text is intentionally thin pending +> reference-implementation validation. Substantive discussion happens on PRs +> against this file and on [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913). 
+ +## Abstract + +This extension defines a small, stable, scheme-agnostic vocabulary for +classifying **data in transit** through MCP tool results, plus an optional +`evidenceRef` pointer that lets a deployment attach richer, out-of-band +evidence without growing the on-wire schema. It is the primary +data-classification extension in the Tool Annotations IG's trust work; other +extensions (information-flow control, action metadata) compose with it rather +than duplicating it. + +The design follows the [@localden](https://github.com/localden) review steer on +SEP-1913 — take a *narrow first cut* of the taxonomy and avoid hard-to-remove +schema — while preserving the layered "small annotation on the wire, rich +evidence out-of-band" consensus that emerged in the SEP-1913 thread. + +## Motivation + +Data crosses tool boundaries today with no standardized markers for whether it +is sensitive or whether it originated from an untrusted source. Clients and +hosts are left to infer this from tool names or model behavior. Two coarse, +broadly-applicable signals cover the majority of client-actionable cases: + +- **`sensitive`** — the content should be treated as confidential (PII, + credentials, proprietary data). Drives consent prompts and egress policy. +- **`untrusted`** — the content originated from an open-world / attacker- + influenceable source (web pages, third-party email, user-generated content). + Drives prompt-injection defenses. + +Anything richer than these two booleans is deliberately **not** on the wire; it +hangs off `evidenceRef` (see below). These two signals are a +**lowest-common-denominator floor**, not a ceiling: servers are encouraged to +*also* attach a richer scheme via `evidenceRef` — see +[Coarse vs. rich](#coarse-vs-rich-classification-dataclass). + +## Specification + +### Dependencies + +This extension depends only on the base MCP `_meta` mechanism. 
It does not +require Tool Resolution ([SEP-1862](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862)), +though it composes with it. + +### Annotation shape + +Trust annotations are carried under the extension-namespaced `_meta` key on a +`CallToolResult` (and MAY appear on an individual `ContentBlock` — see +[Attachment point](#attachment-point)). + +```jsonc +{ + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "sensitive": true, // optional boolean + "untrusted": true, // optional boolean + "evidenceRef": { // optional pointer, see below + "type": "data-class.v1", + "digest": "sha256:…", + "canonicalization": "cbor/rfc8949", + "schema": "https://…/data-class.v1.json", + "ref": "audit://…" // optional locator + } + } + } +} +``` + +Both booleans are optional and have no default value. Absence MUST be +treated as "no claim made," never as "asserted false." + +### The `evidenceRef` slot + +`evidenceRef` is the extension point that keeps the wire schema small while +letting deployments attach arbitrarily rich evidence. Its shape (adapted from +the SEP-1913 discussion): + +| Field | Required | Meaning | +| :--- | :--- | :--- | +| `type` | yes | **Open string** naming the class of the referenced record. NOT an enum. Examples: `"data-class.v1"`, `"ifc.fides.v1"`, `"sequence"`, `"policy-decision"`. | +| `digest` | yes | Hash of the referenced record, so a client holding the data can re-derive it. | +| `canonicalization` | yes | How the digest was computed (e.g. `"cbor/rfc8949"`), so the client can re-hash independently. | +| `schema` | recommended | Identifier/version of the record `ref` resolves to. | +| `ref` | optional | Locator into the deployment's audit/evidence stream. | + +> **Normative intent:** `type` MUST remain an open string. Narrowing it to a +> closed enum would foreclose the capability-token, cosigning, and +> sequence-shape models raised in SEP-1913 review.
A non-binding **registry** +> of well-known `type` values is maintained in this repo; unknown `type` values +> MUST be safely ignorable by a client that does not understand them (the +> `digest`/`canonicalization` pair is still a usable, bounded signal). + +This single slot subsumes the previously separate `attestationChainRef` / +`policyDecisionRef` ideas — both become `type` values. + +`canonicalization` is per-reference precisely so different evidence producers can +be re-derived independently. `cbor/rfc8949` and `jcs/rfc8785` (JSON +Canonicalization Scheme) are both valid envelope choices — neither is the default, +and the `type`/`digest`/`canonicalization` triple is the minimum a client needs +for local re-derivation regardless of which is used. + +### Coarse vs. rich classification (DataClass) + +SEP-1913 carried a four-level data classification +(`public` / `personal` / `confidential` / `highly_confidential`) plus a +regulatory scope (e.g. `confidential:hipaa`). **This extension deliberately +keeps only the coarse `sensitive` boolean on the wire.** Richer classification +is recovered as an `evidenceRef` scheme: + +```jsonc +"evidenceRef": { + "type": "data-class.v1", + // record resolves to e.g. { "class": "highly_confidential", "regulatory": ["hipaa"] } + "digest": "sha256:…", + "canonicalization": "jcs/rfc8785" // a JSON-canonicalized record; CBOR is equally valid +} +``` + +This is an explicit scope decision. The boolean is the **lowest-common- +denominator signal**: a universal, always-actionable floor that lets any client +apply a basic egress/consent policy that is *better than nothing*, even against a +server it knows little about. It can be thought of as the basic, general scheme +that every participant understands. Richer schemes are strictly more capable but +are not universally implemented, so they cannot be the floor. 
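
To make the layering concrete, here is a rough Python sketch of a client that acts on a richer scheme only when it recognises the `type` and can re-derive the `digest` from a record it holds, and otherwise falls back to the `sensitive` floor. Everything named here is an assumption for illustration: the helpers `approx_jcs`, `digest_matches` and `classify` are hypothetical, the `sha256:<hex>` digest encoding is a guess (this draft elides the digest format), and `json.dumps` with sorted keys only approximates `jcs/rfc8785` for simple records with string keys and no floating-point numbers.

```python
import hashlib
import json
from typing import Optional

UNDERSTOOD_TYPES = {"data-class.v1"}  # hypothetical: schemes this client implements


def approx_jcs(record: dict) -> bytes:
    """Approximate jcs/rfc8785 for simple records (not a full RFC 8785 encoder)."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest_matches(record: dict, evidence_ref: dict) -> bool:
    """Re-hash a locally held record and compare it with the claimed digest."""
    if evidence_ref.get("canonicalization") != "jcs/rfc8785":
        return False  # this sketch only re-derives the JSON envelope
    algorithm, _, claimed = str(evidence_ref.get("digest", "")).partition(":")
    if algorithm != "sha256":
        return False  # assumed "sha256:<hex>" encoding; anything else is skipped
    return hashlib.sha256(approx_jcs(record)).hexdigest() == claimed


def classify(annotation: dict, record: Optional[dict] = None) -> str:
    """Return which layer the client acted on: 'scheme', 'boolean' or 'no-claim'."""
    ref = annotation.get("evidenceRef")
    if (
        isinstance(ref, dict)
        and ref.get("type") in UNDERSTOOD_TYPES  # unknown types are ignored
        and record is not None
        and digest_matches(record, ref)
    ):
        return "scheme"
    if annotation.get("sensitive") is True:
        return "boolean"  # the universal floor
    return "no-claim"  # absence is not "asserted false"
```

In this sketch a digest mismatch simply drops back to the boolean; a real host might instead prefer to treat a mismatch as a policy violation in its own right.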
+ +Servers SHOULD therefore emit **both** when they can: the coarse `sensitive` +boolean for universal actionability, **and** a richer `evidenceRef` scheme (e.g. +`data-class.v1`, `ifc.fides.v1`) for hosts that implement it. The two are +layered, not alternatives — a client that understands the scheme uses it; one +that does not still has the boolean. `sensitive` MUST NOT be omitted merely +because a richer scheme is present. + +### Attachment point + +Annotations attach at the **`CallToolResult` level by default.** A server MAY +additionally annotate an individual `ContentBlock` when it has reason to +localize the signal (e.g. one search result among many is untrusted). When both +are present, the content-block annotation refines the result-level one for that +block; it MUST NOT *weaken* a result-level claim (union semantics — once +`true`, stays `true`). + +### Relationship to existing `*Hint` annotations + +MCP tool definitions already carry an `openWorldHint` (alongside `readOnlyHint` / +`destructiveHint` / `idempotentHint`). `untrusted` is deliberately **not** a +synonym for `openWorldHint`: + +- `openWorldHint` is a property of the **tool definition** — "this tool reaches + an open, attacker-influenceable world" (e.g. a web fetch). It is known at + registration time and does not vary per call. +- `untrusted` is a property of a **specific result** — "*this* returned data + originated from an open-world / untrusted source." A tool whose `openWorldHint` + is `true` may still return trusted data on a given call, and a tool whose + `openWorldHint` is `false` can surface untrusted data it read from storage. + +The original SEP-1913 proposal reused `openWorldHint` for the result-level +"untrusted source" signal and then drew exactly this distinction +([issue #711](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711)). +Whether the two should share a name or vocabulary is an +[open question](../../docs/open-questions.md). 
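
Returning to the attachment-point rule (a content-block annotation may refine but never weaken a result-level claim), the union semantics can be read as a per-flag logical OR. A minimal Python sketch, assuming the annotations have already been parsed into plain dicts; the helper name `effective_flags` is hypothetical and is not part of this draft or of any SDK:

```python
from typing import Optional

FLAGS = ("sensitive", "untrusted")


def effective_flags(result_ann: Optional[dict], block_ann: Optional[dict]) -> dict:
    """Combine result-level and content-block-level claims for one block.

    Illustrative only: tracks positive claims and applies union semantics,
    so a block can add a claim but never retract a result-level one.
    """
    result_ann = result_ann or {}
    block_ann = block_ann or {}
    merged = {}
    for flag in FLAGS:
        # Only an explicit `true` counts as a claim. A missing key means
        # "no claim made", and a block-level `false` cannot cancel a
        # result-level `true`.
        if result_ann.get(flag) is True or block_ann.get(flag) is True:
            merged[flag] = True
    return merged
```

For example, a search tool that marks the whole result `sensitive` and one block `untrusted` yields both flags for that block, while the other blocks keep only `sensitive`.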
+ +### Lifecycle and `list_changed` + +Trust annotations defined here are **response-level**: they describe a specific +tool result and are not part of the tool *definition*. They therefore do **not** +participate in `tools/list_changed`. (If a future revision attaches trust +vocabulary to tool definitions, that surface would follow `list_changed`; this +draft does not.) + +### Propagation + +This extension does **not** specify session-level escalation/propagation rules. +Those remain an [open question](../../docs/open-questions.md) on the SEP-1913 +umbrella. A host MAY implement propagation locally; this extension only +standardizes the per-result annotation. + +## Reference implementation + +[`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) +— a Python SDK with a `@trust_annotated` decorator, `to_wire`/`from_wire` +round-tripping, a policy engine (audit/warn/enforce), a healthcare-scenario +demo, and an LLM-based usability study (138 tests). The SDK predates this +narrowed shape and is being aligned to the two-boolean + `evidenceRef` model. + +## Trust model + +Enforcement does not rest on developer honesty. See +[docs/trust-model.md](../../docs/trust-model.md): registries and marketplaces +verifying annotations are the enforcement layer; the annotation is a *claim*, +and `evidenceRef` is how that claim is made checkable. + +## Open questions + +- `sensitive` is settled as the lowest-common-denominator floor; servers are + encouraged to emit it **and** a richer scheme. Residual: is the + [`data-class.v1` scheme](../../schemes/data-class.md) enough, or do some + regulated flows need the classification *on the wire*? See + [docs/open-questions.md](../../docs/open-questions.md). +- Exact required-vs-recommended split on `evidenceRef` fields. +- Whether content-block-level annotation needs a worked multi-result example + before the draft is implementable. 
+ +## Changelog + +| Date | Change | +| ---------- | ------------------------------------------------------------- | +| 2026-06-10 | Initial draft skeleton. Narrowed to `sensitive` + `untrusted` + `evidenceRef`; DataClass demoted to a profile; `requires_review` moved to `action-metadata`. | +| 2026-06-16 | Show `jcs/rfc8785` alongside `cbor/rfc8949` so canonicalization reads as a per-reference envelope choice, not a default; note the `type`/`digest`/`canonicalization` triple as the local-re-derivation minimum. (Review: @Rul1an.) | +| 2026-06-16 | Add `untrusted` vs `openWorldHint` relationship subsection; frame `sensitive` as the lowest-common-denominator floor and recommend servers emit both the boolean and a richer `evidenceRef` scheme. |