From 2aea5dc6be8cf45b702c48143f600bf1e2988b06 Mon Sep 17 00:00:00 2001
From: Sam Morrow
Date: Wed, 10 Jun 2026 23:50:22 +0200
Subject: [PATCH 01/10] docs: scaffold trust/privacy extension repo from SEP-1913 carve

Carve the schema-bearing parts of SEP-1913 into three independent
experimental extensions, per @localden's "narrower first cut" ask and
the Tool Annotations IG's 2026-05-28 extension-first decision:

- io.modelcontextprotocol/trust-annotations (headline): narrow
  data-classification taxonomy (sensitive, untrusted) + open-ended
  evidenceRef pointer.
- io.modelcontextprotocol/action-metadata: carries forward SEP-2061
  (inputMetadata/returnMetadata/outcomes + requiresReview).
- io.modelcontextprotocol/ifc-fides: FIDES information-flow control as a
  profile of evidenceRef, not a wire root.

Adds repo meta (README, MAINTAINERS, CONTRIBUTING, LICENSE) and docs
(sep-disposition, intent-comment, decisions, open-questions,
trust-model, related-work). FIDES demoted to a profile; DataClass and
requiresReview rehomed; maliciousActivityHint and propagation parked on
the SEP-1913 umbrella. Citations index on public sources only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 CONTRIBUTING.md                           |  42 +++++
 LICENSE                                   |  11 ++
 MAINTAINERS.md                            |  20 +++
 README.md                                 |  97 +++++++++++-
 docs/decisions.md                         |  91 +++++++++++
 docs/intent-comment.md                    |  69 +++++++++
 docs/open-questions.md                    |  45 ++++++
 docs/related-work.md                      |  38 +++++
 docs/sep-disposition.md                   | 110 +++++++++++++
 docs/trust-model.md                       |  53 +++++++
 specification/draft/action-metadata.mdx   | 136 ++++++++++++++++
 specification/draft/ifc-fides.mdx         | 132 ++++++++++++++++
 specification/draft/trust-annotations.mdx | 179 ++++++++++++++++++++++
 13 files changed, 1022 insertions(+), 1 deletion(-)
 create mode 100644 CONTRIBUTING.md
 create mode 100644 LICENSE
 create mode 100644 MAINTAINERS.md
 create mode 100644 docs/decisions.md
 create mode 100644 docs/intent-comment.md
 create mode 100644 docs/open-questions.md
 create mode 100644 docs/related-work.md
 create mode 100644 docs/sep-disposition.md
 create mode 100644 docs/trust-model.md
 create mode 100644 specification/draft/action-metadata.mdx
 create mode 100644 specification/draft/ifc-fides.mdx
 create mode 100644 specification/draft/trust-annotations.mdx

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..3258672
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,42 @@
+# Contributing
+
+This repository is an incubation space for the
+[Tool Annotations Interest Group](https://modelcontextprotocol.io/community/tool-annotations/charter).
+We welcome proposals, schema changes, and reference implementations that
+inform a future Extensions Track SEP.
+
+## What lives here
+
+- **Specification drafts** — `specification/draft/.mdx`,
+  one file per extension, written in the same RFC-2119 style as the core MCP
+  specification (per [SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md)).
+- **Decision records** — `docs/decisions.md`. Append, do not rewrite.
+- **Open questions** — `docs/open-questions.md`.
+
+## What does *not* live here
+
+- Implementation code. Reference implementations live in their own
+  repositories and are linked from the relevant `specification/draft/*.mdx`.
+- Binding specification changes. Those are made through the
+  [SEP process](https://modelcontextprotocol.io/community/sep-guidelines).
+
+## Proposing a change to an existing extension
+
+1. Open a PR against `specification/draft/.mdx`.
+2. Update the **Status** and **Changelog** sections in the frontmatter.
+3. If the change is breaking (per the SEP-2133 definition), use a new
+   extension identifier and a new file.
+4. Append an entry to `docs/decisions.md` if the change reflects a design
+   decision worth preserving.
+
+## Proposing a new extension
+
+1. Read [SEP-2133, "Experimental Extensions"](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#experimental-extensions).
+2. Open a discussion or PR proposing the identifier and scope.
+3. On acceptance, add `specification/draft/.mdx` using the
+   frontmatter from an existing draft as a template.
+
+## Code of conduct
+
+This repository follows the
+[MCP Code of Conduct](https://github.com/modelcontextprotocol/.github/blob/main/CODE_OF_CONDUCT.md).
diff --git a/LICENSE b/LICENSE
new file mode 100644
index 0000000..b024e54
--- /dev/null
+++ b/LICENSE
@@ -0,0 +1,11 @@
+Apache License
+Version 2.0, January 2004
+http://www.apache.org/licenses/
+
+Per SEP-2133, official MCP extensions are required to be available under the
+Apache 2.0 license. Experimental extensions in this repository follow the
+same convention so that contributions made here can flow into a future
+official extension repository without re-licensing.
+
+The full text of the Apache License, Version 2.0 is available at:
+https://www.apache.org/licenses/LICENSE-2.0
diff --git a/MAINTAINERS.md b/MAINTAINERS.md
new file mode 100644
index 0000000..a72eb4f
--- /dev/null
+++ b/MAINTAINERS.md
@@ -0,0 +1,20 @@
+# Maintainers
+
+This repository is governed by the
+[Tool Annotations Interest Group](https://modelcontextprotocol.io/community/tool-annotations/charter).
+Day-to-day repository maintenance follows the IG's facilitator structure.
+
+| Role        | Name           | Organization | GitHub                                               |
+| ----------- | -------------- | ------------ | ---------------------------------------------------- |
+| Facilitator | Sam Morrow     | GitHub       | [@SamMorrowDrums](https://github.com/SamMorrowDrums) |
+| Facilitator | Robert Reichel | OpenAI       | [@rreichel3](https://github.com/rreichel3)           |
+
+Per [SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#experimental-extensions),
+core maintainers of the modelcontextprotocol organization retain oversight,
+including the ability to archive or remove this repository.
+
+## Per-extension maintainers
+
+Individual extensions may nominate additional maintainers responsible for
+their specification draft and reference implementations. List them in the
+`specification/draft/.mdx` frontmatter.
diff --git a/README.md b/README.md
index dc87aeb..4f0dbae 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,96 @@
-# experimental-ext-tool-annotations
+# Tool Annotations Interest Group — Experimental Extensions
+
+> ⚠️ **Experimental** — This repository is an incubation space for the
+> [Tool Annotations Interest Group](https://modelcontextprotocol.io/community/tool-annotations/charter).
+> Contents are exploratory drafts intended to feed future Extensions Track SEPs
+> ([SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2133)).
+> They do not represent official MCP specifications or recommendations.
+
+**Charter:** [modelcontextprotocol.io/community/tool-annotations/charter](https://modelcontextprotocol.io/community/tool-annotations/charter)
+**Discord:** [#tool-annotations-ig](https://discord.com/channels/1358869848138059966/1482836798517543073)
+**Open work:** [Pull requests](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pulls)
+
+## Why split the work?
+
+[SEP-1913 (Trust and Sensitivity Annotations)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913)
+bundles four concerns that have proven hard to evaluate as a single unit: a
+client-facing trust taxonomy, action-security metadata for tool I/O, a
+malicious-activity signal, and propagation rules across session boundaries.
+
+The sponsor, [@localden](https://github.com/localden), asked the central
+question directly in review: the SEP "adds a few schema modifications and a
+thorny array-or-scalar polymorphism on enum fields. If the taxonomy turns out
+to be wrong, I worry that we can't remove it or easily modify it. Can we do a
+potential narrower first cut?" The subsequent design discussion converged on a
+layered answer: a small, stable annotation surface on the wire, with richer
+evidence kept out-of-band and referenced by a bounded pointer.
+
+This repo follows that steer. Each concern becomes a **separate experimental
+extension** with its own [reverse-DNS identifier](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#definition),
+its own reference implementation, and its own path to a future Extensions
+Track SEP. Drafts can graduate independently — directly addressing the "narrower
+first cut" ask without throwing away the combinatoric value of the full set.
+
+See [docs/decisions.md](docs/decisions.md) for the decision record and
+[docs/trust-model.md](docs/trust-model.md) for the shared enforcement model.
+
+## Extensions
+
+| Identifier | Status | What it specifies | Reference implementation(s) |
+| :--- | :--- | :--- | :--- |
+| [`io.modelcontextprotocol/trust-annotations`](specification/draft/trust-annotations.mdx) | Draft skeleton | **Headline.** A small, scheme-agnostic client-facing data-classification vocabulary (`sensitive`, `untrusted`) on result `_meta`, plus an optional `evidenceRef` pointer slot that carries richer payloads out-of-band. | Python SDK: [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) (138-test suite, healthcare demo, LLM usability study). |
+| [`io.modelcontextprotocol/action-metadata`](specification/draft/action-metadata.mdx) | Draft skeleton | `inputMetadata` / `returnMetadata` / outcome classifiers (incl. `requires_review`) on `ToolAnnotations`, describing where inputs go, where outputs originate, and what real-world effects a tool can cause. | Carries forward [SEP-2061 (Action Security Metadata)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) by [@rreichel3](https://github.com/rreichel3); reference impl per that proposal (`read_drafts` / `list_inbox` / `send_email`). |
+| [`io.modelcontextprotocol/ifc-fides`](specification/draft/ifc-fides.mdx) | Draft skeleton | A **profile** of the `trust-annotations` `evidenceRef` slot: `type: "ifc.fides.v1"` carrying an integrity + confidentiality label for deterministic information-flow control, following the FIDES paper ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). | Emitter candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server) (does not emit IFC labels today — closing that gap is the proof point). |
+
+### Why FIDES is a profile, not the headline
+
+An earlier sketch made information-flow control the top-level extension. That
+was the wrong cut: it bakes one academic model (an integrity × confidentiality
+lattice) into the namespace root and silently forecloses the other enforcement
+models reviewers raised — capability tokens, caller/tool cosigning, and
+sequence-shape audit records. As one reviewer put it, IFC "fits relatively well
+if you use annotations" — an endorsement of IFC *as a profile*, not as the wire
+root. Demoting it to a `type` value under `trust-annotations`'s open-ended
+`evidenceRef` slot keeps the FIDES work first-class while leaving room for every
+other model to occupy the same slot.
+
+## Relationship to SEP-1913
+
+SEP-1913 remains the canonical place to discuss the overall problem framing.
+This repo carves the schema-bearing parts of that proposal into independently
+shippable pieces. When an extension here is ready to graduate, an Extensions
+Track SEP can reference this repo as the prior art and the working
+implementation that SEP-2133 [requires](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#creation).
+
+For the full per-SEP plan — what happens to SEP-1913, SEP-2061, SEP-1862 and
+others, and the SEP-2127 refactor precedent — see
+[docs/sep-disposition.md](docs/sep-disposition.md).
+
+**Deliberately not carved here** (see [docs/open-questions.md](docs/open-questions.md)):
+
+- **`maliciousActivityHint`** — reviewer concerns are structural (it fires at
+  `tools/resolve` before execution can produce evidence; a boolean is the wrong
+  granularity for client UX; clients won't trust server self-attestation). If it
+  returns, it is per-`ContentBlock` with spans, on a different clock. Parked on
+  the SEP-1913 umbrella as a known cut item.
+- **Propagation rules** — sensitivity escalation across session boundaries, and
+  the sequence-shape gap (an annotation surface for "this was call N in a
+  flagged sequence") remain open. Likely a future extension once the taxonomy
+  and `evidenceRef` shape are stable.
+
+## Repository layout
+
+This repo mirrors the structure of official extension repositories such as
+[`ext-auth`](https://github.com/modelcontextprotocol/ext-auth):
+
+```
+specification/draft/.mdx   # one spec per extension
+docs/                      # decision log, open questions, related work
+MAINTAINERS.md             # IG facilitators
+```
+
+## Contributing
+
+See [CONTRIBUTING.md](CONTRIBUTING.md). Substantive design discussion happens
+on PRs against the relevant `specification/draft/*.mdx` file, in the IG
+Discord, and (for cross-extension concerns) on [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913).
diff --git a/docs/decisions.md b/docs/decisions.md
new file mode 100644
index 0000000..ff4e23b
--- /dev/null
+++ b/docs/decisions.md
@@ -0,0 +1,91 @@
+# Decision log
+
+Append-only record of design decisions for the Tool Annotations IG's trust /
+privacy extension work. Newest at the bottom.
+
+## 2026-06-10 — Carve SEP-1913 into independent extensions
+
+**Decision.** Split the schema-bearing parts of SEP-1913 into separate
+experimental extensions, each with its own `io.modelcontextprotocol/…`
+identifier and reference implementation, rather than pursuing one broad
+Standards Track SEP.
+
+**Rationale.** @localden's review asked for a *narrower first cut*; the IG
+[aligned 2026-05-28](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820)
+on an extension-first strategy. Independent extensions can graduate on their own
+clock and avoid hard-to-remove schema.
+
+## 2026-06-10 — Three initial extensions
+
+**Decision.** `trust-annotations` (headline), `action-metadata`, `ifc-fides`.
+
+**Rationale.** These are the three pieces with either a reference implementation
+or an existing SEP behind them: Kapil's SDK, SEP-2061 (Reichel), and the FIDES
+model respectively.
+
+## 2026-06-10 — FIDES is a profile, not the headline
+
+**Decision.** Information-flow control is `type: "ifc.fides.v1"`, a profile of
+the `trust-annotations` `evidenceRef` slot — not a top-level `io.modelcontextprotocol/ifc`
+extension.
+
+**Rationale.** IFC is one enforcement model among several raised in review
+(capability tokens — pshkv; cosigning — viftode4; sequence shape — marras0914).
+A top-level `ifc/` root would foreclose those. Reviewer endorsement was for IFC
+"if you use annotations" — i.e. as a profile.
+
+## 2026-06-10 — `evidenceRef.type` is an open string
+
+**Decision.** `type` MUST remain an open string with a non-binding registry of
+well-known values; never a closed enum. Required fields are `digest` and
+`canonicalization`; `schema` recommended; `ref` optional.
+
+**Rationale.** Adapted from the vaaraio / Rul1an convergence in the SEP-1913
+thread. An open `type` is what lets IFC, data-class, sequence-shape, and
+attestation profiles share one slot.
+
+## 2026-06-10 — `requiresReview` moves to `action-metadata`
+
+**Decision.** `requiresReview` is an `action-metadata` field, not a
+`trust-annotations` field.
+
+**Rationale.** It is a workflow/consent signal, not a data-classification
+property. Keeping it out of the trust taxonomy avoids reproducing SEP-1913's
+"several concerns in one schema" problem at smaller scale.
+
+## 2026-06-10 — DataClass demoted to a profile
+
+**Decision.** The wire taxonomy keeps only the coarse `sensitive` boolean.
+The four-level classification + regulatory scope becomes an `evidenceRef`
+profile `type: "data-class.v1"`.
+
+**Rationale.** Coarse binary is universally client-actionable and cheap on the
+wire; the richer taxonomy can evolve behind a profile without a breaking schema
+change.
+
+## 2026-06-10 — Parked: maliciousActivityHint, propagation rules
+
+**Decision.** Neither is carved into an extension now; both stay on the SEP-1913
+umbrella.
+
+**Rationale.** `maliciousActivityHint` has unresolved structural objections
+(fires pre-execution at `tools/resolve`; boolean granularity wrong for UX;
+clients won't trust server self-attestation). Propagation/sequence-shape needs
+the taxonomy and `evidenceRef` stable first.
+
+## 2026-06-10 — Citations: public sources only
+
+**Decision.** Reference implementations and motivating examples cite **public**
+artifacts — [`github-mcp-server`](https://github.com/github/github-mcp-server),
+[`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations),
+[arXiv:2505.23643](https://arxiv.org/abs/2505.23643) — and index on the public
+SEP-1913 review record (esp. @localden). Private/internal implementations are
+not named or linked.
+
+## 2026-06-10 — Pre-flight (SEP-1862) stays core
+
+**Decision.** These extensions are response-level and do not depend on Tool
+Resolution. SEP-1862 remains a core/Standards-Track protocol change.
+
+**Rationale.** The 2026-05-28 IG meeting concluded pre-flight is inherently a
+protocol-level change, not an extension.
diff --git a/docs/intent-comment.md b/docs/intent-comment.md
new file mode 100644
index 0000000..319ce77
--- /dev/null
+++ b/docs/intent-comment.md
@@ -0,0 +1,69 @@
+# Intent comment (draft, pre-post review)
+
+This is the comment we plan to post on
+[SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913),
+with an abbreviated pointer version for
+[SEP-2061](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061).
+Kept here so it can be reviewed and stays in sync with
+[sep-disposition.md](./sep-disposition.md).
+
+---
+
+## For SEP-1913
+
+> **Intent: split this SEP and migrate to the Extensions Track**
+>
+> A note on direction for everyone following this thread. When SEP-1913 was
+> first framed, the **Extensions Track** (SEP-2133) and the `experimental-ext-*`
+> incubation process didn't exist in their current form. They now do, and
+> they're a better fit for this work than a single Standards Track SEP.
+>
+> Two things pushed us here:
+> - @localden's review ask for a **narrower first cut** — the concern that a
+>   broad taxonomy with array-or-scalar polymorphism is hard to remove or change
+>   once it lands.
+> - The Tool Annotations IG's
+>   [May 28 decision](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820)
+>   to pursue trust/privacy as an **experimental extension first**, gather
+>   adoption evidence, then ask core maintainers to absorb anything.
+>
+> So the plan is to **carve this proposal into a few small,
+> independently-shippable extensions**, each with its own
+> `io.modelcontextprotocol/…` identifier, reference implementation, and path to
+> an Extensions Track SEP. Incubation is in
+> [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations):
+>
+> | Extension | Scope |
+> |---|---|
+> | `trust-annotations` | The narrow data-classification taxonomy (`sensitive`, `untrusted`) + an open-ended `evidenceRef` pointer for richer, out-of-band evidence. |
+> | `action-metadata` | Tool I/O + outcome contract (folds in @rreichel3's SEP-2061). |
+> | `ifc-fides` | Information-flow control ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) as **one profile** of `evidenceRef`, not a wire root — the public/private-repo confidentiality case, with github-mcp-server as emitter candidate. |
+>
+> Deliberately **not** in the initial carve: `maliciousActivityHint` (the
+> structural concerns raised here are unresolved) and session-level propagation
+> rules. Those stay parked on this umbrella thread.
+>
+> This follows the same Standards-Track → Extensions-Track refactor pattern as
+> SEP-2127 (#2893). This PR stays open as the umbrella / problem-framing thread;
+> the schema-bearing pieces move out. Feedback on the carve is very welcome —
+> particularly on whether any parked item deserves its own extension sooner.
+
+---
+
+## For SEP-2061 (abbreviated pointer)
+
+> Cross-linking for visibility: the Tool Annotations IG is carving the trust /
+> privacy / action-metadata work into small experimental extensions in
+> [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations)
+> (background: SEP-1913 comment
+> [here](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913),
+> and the [May 28 IG decision](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820)).
+>
+> This proposal maps directly onto the
+> [`action-metadata`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/blob/main/specification/draft/action-metadata.mdx)
+> extension — `inputMetadata` / `returnMetadata` / outcomes, plus a
+> `requiresReview` signal we pulled out of the trust taxonomy because it's a
+> workflow concern, not a data property. The intent is to carry SEP-2061 forward
+> as that extension rather than run a parallel proposal. @rreichel3 — flagging
+> so we co-own it rather than diverge; happy to keep this thread as the home for
+> the field semantics.
diff --git a/docs/open-questions.md b/docs/open-questions.md
new file mode 100644
index 0000000..c4747c7
--- /dev/null
+++ b/docs/open-questions.md
@@ -0,0 +1,45 @@
+# Open questions
+
+Tracked here rather than in the spec drafts, so the drafts stay non-temporal.
+
+## Cross-cutting
+
+- **Where does the policy-enforcement engine live** across different user
+  universes (cross-org, cross-domain)? Engines work well within one universe;
+  cross-domain is the hard case. (IG 2026-05-28.)
+- **Cross-domain integrity verification** — is asymmetric crypto for domain
+  identity in scope for a future extension, or out of scope entirely? CLI tools
+  remain a persistent gap for enforcing these constraints.
+- **`evidenceRef.type` registry** — who curates the list of well-known profile
+  types, and how do we coordinate with attestation SEPs (e.g. SEP-2787) so
+  values don't collide?
+
+## trust-annotations
+
+- Is `sensitive` the right single coarse signal, or do we need the
+  `data-class.v1` profile from day one?
+- Content-block-level vs. result-level attachment — does the draft need a
+  worked multi-result example before it's implementable?
+- `list_changed`: confirmed response-level annotations don't participate; revisit
+  only if trust vocabulary ever attaches to tool definitions.
+
+## action-metadata
+
+- Coexistence vs. replacement of legacy `destructiveHint` / `readOnlyHint` /
+  `idempotentHint` / `openWorldHint`.
+- Open strings vs. closed enums for `destination` / `source` / `sensitivity`.
+- Does `requiresReview` need a machine-readable *reason* for good client UX?
+
+## ifc-fides
+
+- Inline `_meta.ifc` for low-friction adoption vs. always behind `evidenceRef`.
+- GitHub Enterprise `internal` repo visibility → public/private/reader-set
+  mapping (audience is the whole org, broader than collaborators).
+
+## Parked (SEP-1913 umbrella, not carved)
+
+- **`maliciousActivityHint`** — if it returns, it is per-`ContentBlock` with
+  spans, driven by the host's own detection, not a server-attested boolean.
+- **Session-level propagation rules** — escalation semantics and the
+  sequence-shape gap ("this was call N in a flagged sequence" has no response
+  annotation surface today).
diff --git a/docs/related-work.md b/docs/related-work.md
new file mode 100644
index 0000000..c8c6e16
--- /dev/null
+++ b/docs/related-work.md
@@ -0,0 +1,38 @@
+# Related work
+
+External references and prior art relevant to the IG's trust / privacy
+annotation work. Several were surfaced in IG meetings (notably 2026-05-28).
+
+## SEPs
+
+- [SEP-1913 — Trust and Sensitivity Annotations](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) — the umbrella this work carves from.
+- [SEP-2061 — Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) — carried forward as `action-metadata`.
+- [SEP-1862 — Tool Resolution / pre-flight checks](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862) — core-protocol, composes with these extensions.
+- [SEP-2133 — Extensions](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md) — the framework this repo incubates under.
+- [SEP-2127 — Server Cards](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893) — precedent for the Standards→Extensions Track refactor.
+- [SEP-2787 — Tool Call Attestation](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) — candidate `evidenceRef` profile.
+
+## Research
+
+- **FIDES** — *Information-flow control for LLM agents.* [arXiv:2505.23643](https://arxiv.org/abs/2505.23643). Basis for the `ifc.fides.v1` profile.
+- **Design Patterns for Securing LLM Agents** — IBM/Google/Microsoft. [arXiv:2506.08837](https://arxiv.org/abs/2506.08837). Plan-Then-Execute, Dual LLM, Map-Reduce, etc.
+- **Trail of Bits** — prompt-injection via hidden content in GitHub issues. [blog](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/).
+- **OpenAI Auto Review** — https://alignment.openai.com/auto-review/ (shared in IG chat).
+
+## Implementations & tooling
+
+- [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) — reference Python SDK PoC for `trust-annotations`.
+- [`github-mcp-server`](https://github.com/github/github-mcp-server) — public MCP server; emitter candidate for `ifc-fides` (knows repo visibility + collaborators).
+- **Ethyca** data-labeling docs — https://www.ethyca.com/docs (shared in IG chat).
+- **GitHub Next** agentic-workflows research on data labeling — to be documented as issues in this repo (IG action item, @gokhanarkan / @joannakl).
+
+## Adjacent community proposals (from the SEP-1913 thread)
+
+- **SINT Protocol** (capability-token constraint enforcement) — pshkv.
+- **in-toto** attestations as a trust-annotation substrate.
+- **OVERT 1.0** envelope shape for runtime evidence.
+- Caller/tool **cosigning** model — viftode4.
+- **Sequence-shape** policies — marras0914.
+
+These are exactly the models that `evidenceRef`'s open `type` is designed to
+accommodate as profiles.
diff --git a/docs/sep-disposition.md b/docs/sep-disposition.md
new file mode 100644
index 0000000..f930340
--- /dev/null
+++ b/docs/sep-disposition.md
@@ -0,0 +1,110 @@
+# SEP disposition: what happens to the existing proposals
+
+This document explains how the existing trust/privacy/annotation SEPs map onto
+the experimental extensions incubated in this repository, and what is proposed
+to happen to each SEP. It exists so that anyone arriving from one of those PRs
+can understand the plan without reading the whole thread.
+
+> **Status:** proposal / options. Nothing here is decided until reflected in the
+> relevant SEP PRs. The migration follows the precedent set by
+> [SEP-2127 → Extensions Track (#2893)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893).
+
+## Why anything changes
+
+When [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913)
+was first framed, the **Extensions Track** ([SEP-2133](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md))
+and the `experimental-ext-*` incubation process did not exist in their current
+form. Two inputs since then point at a different shape:
+
+1. **Sponsor steer.** [@localden](https://github.com/localden) asked for a
+   *narrower first cut* — a single broad taxonomy with array-or-scalar enum
+   polymorphism is hard to remove or change once shipped.
+2. **IG decision.** The Tool Annotations IG
+   [aligned on 2026-05-28](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820)
+   to pursue this work as an **experimental extension first**, build an
+   adoption/evidence base, and only then ask core maintainers to absorb
+   anything into the protocol.
+
+The result: carve the schema-bearing parts into small, independent extensions,
+each able to graduate on its own clock.
+
+## The precedent: SEP-2127 (Server Cards)
+
+[#2893](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893)
+refactored SEP-2127 from **Standards Track** to **Extensions Track**:
+
+- Frontmatter `Type: Standards Track` → `Type: Extensions Track`, plus an
+  `Extension Identifier` and "on behalf of the WG" attribution.
+- A top-of-file `` pointing at the experimental repo as the spec home.
+- The SEP body **slimmed to a charter** — Abstract, Motivation, Rationale, a
+  high-level Specification *pointer*, security posture *summary* — with the
+  detailed normative wire format delegated to the extension repo.
+- The SEP body kept **non-temporal**: a published SEP is frozen, so in-flight
+  "open items" live in the PR description and extension-repo issues, not in the
+  SEP text.
+
+We apply the same playbook below.
+
+## Per-SEP disposition
+
+### SEP-1913 — Trust and Sensitivity Annotations
+
+**Proposed:** becomes the **umbrella / problem-framing** thread. The schema-
+bearing content is carved out into the extensions below. Options, in order of
+preference:
+
+- **(A, preferred)** Keep the PR open as the framing umbrella; add an intent
+  comment (see [below](#intent-comment)); later, either refactor it to an
+  Extensions Track *charter* that points here (SEP-2127 shape) **or** close it
+  in favor of per-extension Extensions Track SEPs once those are ready.
+- **(B)** Refactor 1913 itself into the `trust-annotations` Extensions Track
+  SEP and spin the others off as siblings.
+- **(C)** Close 1913 outright and open three fresh Extensions Track SEPs. Loses
+  the discussion history's continuity; not preferred.
+
+**Carved out:** `trust-annotations`, `action-metadata`, `ifc-fides`.
+**Parked on the umbrella (not carved):** `maliciousActivityHint`,
+session-level propagation rules. See [open-questions.md](./open-questions.md).
+
+### SEP-2061 — Action Security Metadata
+
+**Proposed:** becomes the [`action-metadata`](../specification/draft/action-metadata.mdx)
+extension. SEP-2061 is by [@rreichel3](https://github.com/rreichel3), who is
+also an IG co-facilitator and SEP-1913 co-author, so this is a fold-in, not a
+collision. Disposition mirrors 1913 option (A): keep the thread as the field-
+semantics discussion, add a pointer comment linking it to the
+`action-metadata` carve, refactor to Extensions Track when ready.
+
+### SEP-1862 — Tool Resolution (pre-flight checks)
+
+**Proposed:** **stays Standards Track / core.** The 2026-05-28 IG meeting
+concluded pre-flight checks are inherently a protocol-level change, not an
+extension. These extensions are deliberately **response-level** (`_meta` on
+results, static `ToolAnnotations`) and do **not** depend on 1862. They compose
+with it if it lands, but do not block on it.
+
+### Other related SEPs (not owned here)
+
+- **SEP-1984 (Comprehensive Tool Annotations)**, **SEP-2417 (Model Preferences
+  for Tools)** — tracked by the IG as discussion items; not part of this carve.
+  Cross-link only.
+- **SEP-2787 (Tool Call Attestation)** and the various attestation/evidence + threads — these are natural `evidenceRef` *profile* candidates rather than + competitors. Coordinate so the `evidenceRef.type` registry can list them. + +## Mapping table + +| SEP | Title | Proposed disposition | Extension home | +| :-- | :-- | :-- | :-- | +| 1913 | Trust & Sensitivity Annotations | Umbrella thread; carve schema out | `trust-annotations` (+ `ifc-fides`) | +| 2061 | Action Security Metadata | Fold into extension | `action-metadata` | +| 1862 | Tool Resolution (pre-flight) | Stays core / Standards Track | — (composes, no dependency) | +| 1984 | Comprehensive Tool Annotations | IG discussion item | — | +| 2417 | Model Preferences for Tools | IG discussion item | — | +| 2787 | Tool Call Attestation | Candidate `evidenceRef` profile | (future) | + +## Intent comment + +The text we plan to post on SEP-1913 (and, abbreviated, on SEP-2061) lives in +[intent-comment.md](./intent-comment.md) so it can be reviewed before posting +and kept in sync with this document. diff --git a/docs/trust-model.md b/docs/trust-model.md new file mode 100644 index 0000000..98774e4 --- /dev/null +++ b/docs/trust-model.md @@ -0,0 +1,53 @@ +# Trust model + +A single statement of the enforcement model shared by all extensions in this +repository, so individual specs don't re-litigate it. + +## Annotations are claims, not guarantees + +An annotation on a tool result or definition is a **claim** made by whoever +produced it. Nothing in these extensions assumes the producer is honest or +competent. The value of an annotation comes from two things: + +1. **Verifiability.** Where a claim needs to be trusted, the `evidenceRef` + pointer (see [`trust-annotations`](../specification/draft/trust-annotations.mdx)) + lets a consumer resolve and check the evidence behind it — re-hash the + referenced record, verify a signature, check an inclusion proof — rather + than taking the claim on faith. +2. 
**Accountability.** Enforcement lives with the parties that can impose + consequences: **registries and marketplaces** that admit servers, + **hosts/clients** that gate actions, and **operators** that set policy. The + annotation gives those parties a machine-readable surface to act on. + +This mirrors the framing repeated throughout the SEP-1913 discussion: trust +comes from the ecosystem verifying annotations, not from developer good faith. +A server that lies in its annotations is a server a registry can refuse to list +and a host can refuse to trust — the same accountability model as any other +declared capability. + +## Defense in depth, not a single gate + +As framed in the IG's inaugural meeting, these annotations are **defense in +depth**: they reduce the likelihood of unintended actions (unnecessary +destructive operations, data crossing a boundary it shouldn't), but they do not +claim to be a complete security boundary. A host SHOULD combine them with its +own checks (its own injection detection, its own policy engine) rather than +treating any single annotation as authoritative. + +## Human-in-the-loop on the risky edges + +A recurring pattern from the IG research (Joanna's "phases" work, and the +SEP-1913 thread): rather than blanket-blocking flows a policy engine is unsure +about, **flag the specific call for user confirmation**. This preserves utility +while keeping a human on the genuinely risky edges, and is the recommended +default for `requiresReview` ([`action-metadata`](../specification/draft/action-metadata.mdx)) +and for IFC policy violations ([`ifc-fides`](../specification/draft/ifc-fides.mdx)). + +## Cross-domain is the hard case + +Policy engines work well inside a single user/organization universe and +struggle across universes (cross-org, cross-domain flows). These extensions +provide the *signal* (where data came from, how sensitive it is, what a tool +does with it); they do not solve cross-domain enforcement on their own. 
That +remains an [open question](./open-questions.md) and a likely area for future +work (e.g. asymmetric crypto for domain integrity verification). diff --git a/specification/draft/action-metadata.mdx b/specification/draft/action-metadata.mdx new file mode 100644 index 0000000..766f3c2 --- /dev/null +++ b/specification/draft/action-metadata.mdx @@ -0,0 +1,136 @@ +--- +title: Action Metadata +extension_identifier: io.modelcontextprotocol/action-metadata +status: Draft (experimental) +maintainers: + - "@rreichel3" + - "@SamMorrowDrums" +related: + - https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061 + - https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913 +--- + +> ⚠️ **Experimental draft skeleton.** This carries forward +> [SEP-2061: Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) +> by [@rreichel3](https://github.com/rreichel3) into the IG's experimental repo, +> per the May 28 2026 decision to pursue trust/privacy work as an extension +> first. SEP-2061 remains the canonical discussion thread for the field +> semantics. + +## Abstract + +This extension adds a small, declarative contract to a tool's static +`ToolAnnotations` describing **what the tool does with data**: where inputs may +go, where outputs originate, and what real-world outcome invoking it can cause. +Where [`trust-annotations`](./trust-annotations.mdx) classifies *data in +transit*, this extension classifies *tool behavior*. The two are complementary +and independently adoptable — a client can consume action metadata without +implementing trust annotations at all. + +## Motivation + +MCP today treats all tool calls as equivalent at the protocol level beyond the +coarse `readOnlyHint` / `destructiveHint` / `idempotentHint` / `openWorldHint` +hints. A tool that reads drafts and a tool that sends email are otherwise +indistinguishable, even though their privacy and consent implications differ +radically. 
Runtimes fall back to inferring risk from tool names or model +behavior, which does not scale. + +This was reinforced in the May 28 2026 IG meeting: a model often **cannot tell +whether a target is private or public**, and absent that signal it may push +content somewhere it should not. A declarative behavioral contract lets clients +and models make safer decisions without baking domain knowledge into every +model. + +The canonical worked example from SEP-2061: `read_drafts`, `list_inbox`, and +`send_email` can share an identical JSON Schema yet have completely different +security semantics — only action metadata distinguishes them. + +## Specification + +### Dependencies + +This extension annotates the existing `ToolAnnotations` object returned by +`tools/list`. It has no dependency on `trust-annotations` or on Tool Resolution. + +### Fields + +Carried under the extension-namespaced key on `ToolAnnotations`: + +```jsonc +{ + "annotations": { + "io.modelcontextprotocol/action-metadata": { + "inputMetadata": { + "destination": "external", // where input data may be stored/sent + "sensitivity": "personal" // kind of data the tool accepts + }, + "returnMetadata": { + "source": "open-world", // where returned data originates + "sensitivity": "public" + }, + "outcome": "consequential", // benign | consequential | irreversible + "requiresReview": true // host SHOULD seek human confirmation + } + } +} +``` + +| Field | Meaning | +| :--- | :--- | +| `inputMetadata.destination` | Where data passed to the tool may end up (e.g. `local`, `internal`, `external`). | +| `inputMetadata.sensitivity` | The kind of data the tool is designed to accept. | +| `returnMetadata.source` | Where the tool's returned data originates (e.g. `first-party`, `open-world`). | +| `returnMetadata.sensitivity` | The kind of data the tool is designed to return. | +| `outcome` | Real-world effect class: `benign` / `consequential` / `irreversible`. 
| +| `requiresReview` | The tool author signals that a host SHOULD obtain explicit human confirmation before invocation. | + +> Exact enum value sets are inherited from SEP-2061 and are **not** re-litigated +> here; this draft tracks that proposal. Where SEP-2061 evolves, this file +> follows. + +### `requiresReview` lives here, deliberately + +`requiresReview` is a **workflow/consent** signal, not a data-classification +property. It was intentionally moved out of [`trust-annotations`](./trust-annotations.mdx) +(which stays strictly data-classifying) to avoid reproducing SEP-1913's +"several concerns in one schema" problem at smaller scale. It sits next to +`outcome` because both describe the *act* of calling the tool rather than the +*data* in flight. + +### Lifecycle and `list_changed` + +These fields are part of the **tool definition** (`ToolAnnotations`). They are +therefore covered by `tools/list_changed`: a server that changes a tool's +action metadata MUST emit `list_changed` as it would for any tool-definition +change. (This is the opposite of `trust-annotations`, which is response-level.) + +## Relationship to existing annotations + +`outcome: irreversible` overlaps conceptually with `destructiveHint` but is +strictly richer (a three-way classification vs. a boolean) and is scoped to the +real-world effect rather than to whether the operation is destructive to +server-side state. The IG will need to decide whether action metadata +*supersedes* or *coexists with* the legacy hints before any graduation. + +## Reference implementation + +Per [SEP-2061](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061): +the `read_drafts` / `list_inbox` / `send_email` worked example with identical +schemas and divergent action metadata. A public MCP server emitting these +fields (candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server)) +would anchor the draft in a real ecosystem. + +## Open questions + +- Coexistence vs. 
replacement of `destructiveHint` / `readOnlyHint`. +- Whether `destination` / `source` / `sensitivity` enums should be open strings + (consistent with `evidenceRef.type`) or closed enums. +- Whether `requiresReview` needs a machine-readable *reason* (vs. a bare + boolean) to drive good client UX. + +## Changelog + +| Date | Change | +| ---------- | ------------------------------------------------------------------- | +| 2026-06-10 | Initial draft skeleton, carrying SEP-2061 into the experimental repo; absorbed `requiresReview` from the trust taxonomy. | diff --git a/specification/draft/ifc-fides.mdx b/specification/draft/ifc-fides.mdx new file mode 100644 index 0000000..5895d0a --- /dev/null +++ b/specification/draft/ifc-fides.mdx @@ -0,0 +1,132 @@ +--- +title: Information-Flow Control (FIDES profile) +extension_identifier: io.modelcontextprotocol/ifc-fides +status: Draft (experimental) +maintainers: + - "@SamMorrowDrums" +profile_of: io.modelcontextprotocol/trust-annotations +related: + - https://arxiv.org/abs/2505.23643 + - https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913 +--- + +> ⚠️ **Experimental draft skeleton.** This defines a *profile* of the +> [`trust-annotations`](./trust-annotations.mdx) `evidenceRef` slot. It is **not** +> a standalone wire root — see [Why a profile](#why-a-profile). + +## Abstract + +This extension defines `ifc.fides.v1`, a profile of the `trust-annotations` +`evidenceRef` slot that carries an **information-flow-control label** — +integrity plus confidentiality — following the FIDES model +([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). A host that implements +deterministic information-flow control can consume these labels to decide +whether a tool call is permitted, without baking the IFC model into the core +protocol or into the `trust-annotations` wire surface. + +## Why a profile + +An earlier sketch proposed information-flow control as a top-level extension +(`io.modelcontextprotocol/ifc`). 
That was the wrong cut. IFC is **one** +enforcement model among several that reviewers of SEP-1913 raised — capability +tokens, caller/tool cosigning, and sequence-shape audit records were all put +forward. Making the FIDES integrity × confidentiality lattice the namespace +root would silently foreclose those. + +Carrying it as a `type` value under the open-ended `evidenceRef` slot keeps the +FIDES work first-class while leaving the slot free for every other model. One +reviewer's framing captured it: IFC "fits relatively well *if you use +annotations*" — an endorsement of IFC as a profile, not as the wire root. + +## Motivation + +The motivating case is the one raised in the +[2026-05-28 IG meeting](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820): +**a model often cannot tell whether a repository is public or private**, and +lacking that signal it may push private content to a public destination. An IFC +label lets the host track confidentiality (who may read this data) and +integrity (is this data trusted) as context accumulates across tool calls, and +deny or prompt before a flow violates policy. + +A public MCP server is the natural emitter. [`github-mcp-server`](https://github.com/github/github-mcp-server) +returns repository data whose confidentiality is determined by repo visibility +and collaborator sets — exactly the public/private signal above — but does +**not** emit IFC labels today. Closing that emitter gap is the concrete proof +point for this profile: a host-side consumer of the label shape already exists, +so the missing half is a server willing to emit it. + +## Specification + +### Profile identity + +This profile is selected by `evidenceRef.type == "ifc.fides.v1"` on a +`trust-annotations` annotation. A client that does not implement IFC MUST be +able to ignore it safely (the surrounding `sensitive` / `untrusted` booleans and +the `digest`/`canonicalization` pair remain meaningful). 
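A minimal sketch of how a client might apply this selection rule, assuming the result's `_meta` has already been parsed into a Python dict. The function name `read_trust_annotation` and the shape of the returned summary are hypothetical and not part of this draft; only the `_meta` key, the two booleans, and the `ifc.fides.v1` type string come from the specification text.

```python
from typing import Any, Optional

TRUST_KEY = "io.modelcontextprotocol/trust-annotations"
FIDES_TYPE = "ifc.fides.v1"


def read_trust_annotation(meta: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: summarize a result's trust annotation."""
    annotation = (meta or {}).get(TRUST_KEY) or {}
    evidence = annotation.get("evidenceRef") or {}
    return {
        # Absent booleans stay None: "no claim made", not "asserted false".
        "sensitive": annotation.get("sensitive"),
        "untrusted": annotation.get("untrusted"),
        # A client without IFC support can skip this flag and still use the booleans.
        "has_fides_label": evidence.get("type") == FIDES_TYPE,
        "evidence_type": evidence.get("type"),
    }
```

A client that does not implement IFC would read only `sensitive` and `untrusted` from this summary; an unrecognized `evidence_type` is reported but never treated as an error.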
+ +### Label payload + +The record referenced by the `evidenceRef` (which, for low-friction adoption, +deployments that accept the wire cost MAY inline) has the shape: + +```jsonc +{ + "integrity": "trusted", // "trusted" | "untrusted" (FIDES §4.1 two-level lattice) + "confidentiality": "public" // "public" | "private" | ["login1","login2", …] +} +``` + +| Field | Meaning | +| :--- | :--- | +| `integrity` | Two-level integrity lattice (`trusted` ⊑ `untrusted`): trusted data may flow to untrusted sinks, not vice versa. | +| `confidentiality` | `"public"` = world-readable; `"private"` = an opaque marker the host resolves to a reader set (e.g. repo collaborators); an explicit `string[]` = pre-resolved reader logins. Fewer readers = more confidential = higher in the lattice. | + +### Label semantics + +- **Join on accumulation.** As a session ingests labeled results, the context + label is the *join* of what it has seen: integrity degrades toward + `untrusted`, confidentiality narrows toward the intersection of the reader sets seen. +- **Policy check before egress.** Before a write/egress tool call, the host + checks whether the current context label may flow to the call's target. When + a label is absent, the host falls back to its default (trusted-action) + policy rather than assuming the worst — labels are an *additive* signal. +- **Confidentiality resolution.** `"private"` is intentionally opaque on the + wire; resolving it to a concrete reader set (e.g. via a collaborators lookup) + is a host concern, so servers need not enumerate audiences inline. + +> The normative integrity/confidentiality lattice definitions follow the FIDES +> paper, §4.1 and §4.3. This draft references the model rather than restating +> the proofs. + +### Relationship to `trust-annotations` + +`ifc-fides` never appears without a host `trust-annotations` annotation +carrying the `evidenceRef`. 
The booleans are the universally-actionable signal; +the IFC label is the precise, host-checkable evidence behind them. + +## Reference implementation + +- **Consumer:** a host-side IFC engine that parses the `{integrity, + confidentiality}` label, maintains a context label across tool results, and + applies a flow policy before egress operations already exists in practice. + (Linked once a public reference is available.) +- **Emitter (gap / proof point):** [`github-mcp-server`](https://github.com/github/github-mcp-server) + is the candidate — it already knows repo visibility and collaborator sets, + which are exactly the confidentiality inputs. + +## Open questions + +- Should the label be inlinable on `_meta.ifc` directly for low-friction + adoption, or always behind `evidenceRef` for schema minimalism? (Lean: + permit both; `evidenceRef` is canonical, inline is a convenience.) +- How does GitHub Enterprise `internal` repo visibility map onto the + public/private/reader-set confidentiality model? (Audience is the whole org, + strictly broader than collaborators — likely falls back to default policy.) +- Registry coordination with other attestation/evidence profiles + (e.g. SEP-2787) so `evidenceRef.type` values don't collide. + +## Changelog + +| Date | Change | +| ---------- | ------------------------------------------------------------ | +| 2026-06-10 | Initial draft skeleton. Reframed from a top-level `ifc` extension to a `trust-annotations` `evidenceRef` profile (`ifc.fides.v1`). 
| diff --git a/specification/draft/trust-annotations.mdx b/specification/draft/trust-annotations.mdx new file mode 100644 index 0000000..d43200c --- /dev/null +++ b/specification/draft/trust-annotations.mdx @@ -0,0 +1,179 @@ +--- +title: Trust Annotations +extension_identifier: io.modelcontextprotocol/trust-annotations +status: Draft (experimental) +maintainers: + - "@SamMorrowDrums" + - "@rreichel3" +related: + - https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913 + - https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md +--- + +> ⚠️ **Experimental draft skeleton.** This document captures the agreed shape +> and the open questions. Normative text is intentionally thin pending +> reference-implementation validation. Substantive discussion happens on PRs +> against this file and on [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913). + +## Abstract + +This extension defines a small, stable, scheme-agnostic vocabulary for +classifying **data in transit** through MCP tool results, plus an optional +`evidenceRef` pointer that lets a deployment attach richer, out-of-band +evidence without growing the on-wire schema. It is the headline extension of +the Tool Annotations IG's trust work; other extensions (information-flow +control, action metadata) compose with it rather than duplicating it. + +The design follows the [@localden](https://github.com/localden) review steer on +SEP-1913 — take a *narrow first cut* of the taxonomy and avoid hard-to-remove +schema — while preserving the layered "small annotation on the wire, rich +evidence out-of-band" consensus that emerged in the SEP-1913 thread. + +## Motivation + +Data crosses tool boundaries today with no standardized markers for whether it +is sensitive or whether it originated from an untrusted source. Clients and +hosts are left to infer this from tool names or model behavior. 
Two coarse, +broadly-applicable signals cover the majority of client-actionable cases: + +- **`sensitive`** — the content should be treated as confidential (PII, + credentials, proprietary data). Drives consent prompts and egress policy. +- **`untrusted`** — the content originated from an open-world / attacker- + influenceable source (web pages, third-party email, user-generated content). + Drives prompt-injection defenses. + +Anything richer than these two booleans is deliberately **not** on the wire; it +hangs off `evidenceRef` (see below). + +## Specification + +### Dependencies + +This extension depends only on the base MCP `_meta` mechanism. It does not +require Tool Resolution ([SEP-1862](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862)), +though it composes with it. + +### Annotation shape + +Trust annotations are carried under the extension-namespaced `_meta` key on a +`CallToolResult` (and MAY appear on an individual `ContentBlock` — see +[Attachment point](#attachment-point)). + +```jsonc +{ + "_meta": { + "io.modelcontextprotocol/trust-annotations": { + "sensitive": true, // optional boolean + "untrusted": true, // optional boolean + "evidenceRef": { // optional pointer, see below + "type": "data-class.v1", + "digest": "sha256:…", + "canonicalization": "cbor/rfc8949", + "schema": "https://…/data-class.v1.json", + "ref": "audit://…" // optional locator + } + } + } +} +``` + +Both booleans are optional, and neither has a default value. Absence MUST be +treated as "no claim made," never as "asserted false." + +### The `evidenceRef` slot + +`evidenceRef` is the extension point that keeps the wire schema small while +letting deployments attach arbitrarily rich evidence. Its shape (adapted from +the SEP-1913 discussion): + +| Field | Required | Meaning | +| :--- | :--- | :--- | +| `type` | yes | **Open string** naming the class of the referenced record. NOT an enum. 
Examples: `"data-class.v1"`, `"ifc.fides.v1"`, `"sequence"`, `"policy-decision"`. | +| `digest` | yes | Hash of the referenced record, so a client holding the data can re-derive it. | +| `canonicalization` | yes | How the digest was computed (e.g. `"cbor/rfc8949"`), so the client can re-hash independently. | +| `schema` | recommended | Identifier/version of the record `ref` resolves to. | +| `ref` | optional | Locator into the deployment's audit/evidence stream. | + +> **Normative intent:** `type` MUST remain an open string. Narrowing it to a +> closed enum would foreclose the capability-token, cosigning, and +> sequence-shape models raised in SEP-1913 review. A non-binding **registry** +> of well-known `type` values is maintained in this repo; unknown `type` values +> MUST be safely ignorable by a client that does not understand them (the +> `digest`/`canonicalization` pair is still a usable, bounded signal). + +This single slot subsumes the previously separate `attestationChainRef` / +`policyDecisionRef` ideas — both become `type` values. + +### Coarse vs. rich classification (DataClass) + +SEP-1913 carried a four-level data classification +(`public` / `personal` / `confidential` / `highly_confidential`) plus a +regulatory scope (e.g. `confidential:hipaa`). **This extension deliberately +keeps only the coarse `sensitive` boolean on the wire.** Richer classification +is recovered as an `evidenceRef` profile: + +```jsonc +"evidenceRef": { + "type": "data-class.v1", + // record resolves to e.g. { "class": "highly_confidential", "regulatory": ["hipaa"] } + "digest": "sha256:…", + "canonicalization": "cbor/rfc8949" +} +``` + +This is an explicit scope decision: the binary lives on the wire for universal +client actionability; the taxonomy lives behind a profile so it can evolve +without a breaking schema change. 
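As a hedged illustration of how a client could check a resolved `data-class.v1` record against its `evidenceRef`, the sketch below re-hashes bytes that the caller has already canonicalized. It assumes the digest is written as `sha256:` followed by hex, an encoding this draft does not specify. `digest_matches` is a hypothetical name, and the canonicalization step itself (for example CBOR encoding) is left to the caller.

```python
import hashlib


def digest_matches(canonical_bytes: bytes, digest: str) -> bool:
    """Hypothetical check: does `digest` match the SHA-256 of the canonical bytes?

    Assumes a "sha256:<hex>" digest string; the hex encoding is an assumption.
    """
    algorithm, _, expected_hex = digest.partition(":")
    if algorithm != "sha256" or not expected_hex:
        # Unknown or malformed digest: report "not verified" instead of raising.
        return False
    actual_hex = hashlib.sha256(canonical_bytes).hexdigest()
    return actual_hex == expected_hex.lower()
```

A `False` result here means only that the richer classification could not be verified; the coarse `sensitive` boolean on the wire remains usable on its own.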
+ +### Attachment point + +Annotations attach at the **`CallToolResult` level by default.** A server MAY +additionally annotate an individual `ContentBlock` when it has reason to +localize the signal (e.g. one search result among many is untrusted). When both +are present, the content-block annotation refines the result-level one for that +block; it MUST NOT *weaken* a result-level claim (union semantics — once +`true`, stays `true`). + +### Lifecycle and `list_changed` + +Trust annotations defined here are **response-level**: they describe a specific +tool result and are not part of the tool *definition*. They therefore do **not** +participate in `tools/list_changed`. (If a future revision attaches trust +vocabulary to tool definitions, that surface would follow `list_changed`; this +draft does not.) + +### Propagation + +This extension does **not** specify session-level escalation/propagation rules. +Those remain an [open question](../../docs/open-questions.md) on the SEP-1913 +umbrella. A host MAY implement propagation locally; this extension only +standardizes the per-result annotation. + +## Reference implementation + +[`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) +— a Python SDK with a `@trust_annotated` decorator, `to_wire`/`from_wire` +round-tripping, a policy engine (audit/warn/enforce), a healthcare-scenario +demo, and an LLM-based usability study (138 tests). The SDK predates this +narrowed shape and is being aligned to the two-boolean + `evidenceRef` model. + +## Trust model + +Enforcement does not rest on developer honesty. See +[docs/trust-model.md](../../docs/trust-model.md): registries and marketplaces +verifying annotations are the enforcement layer; the annotation is a *claim*, +and `evidenceRef` is how that claim is made checkable. + +## Open questions + +- Is `sensitive` the right single coarse signal, or do we need `sensitive` + + the `data-class.v1` profile from day one? 
+- Exact required-vs-recommended split on `evidenceRef` fields. +- Whether content-block-level annotation needs a worked multi-result example + before the draft is implementable. + +## Changelog + +| Date | Change | +| ---------- | ------------------------------------------------------------- | +| 2026-06-10 | Initial draft skeleton. Narrowed to `sensitive` + `untrusted` + `evidenceRef`; DataClass demoted to a profile; `requires_review` moved to `action-metadata`. | From 8a867d4bb95b6f5e067f2c8c4f834c0099cdb4af Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Thu, 11 Jun 2026 00:02:51 +0200 Subject: [PATCH 02/10] docs: finalize intent comments and record posted links Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/intent-comment.md | 40 ++++++++++++++++------------------------ 1 file changed, 16 insertions(+), 24 deletions(-) diff --git a/docs/intent-comment.md b/docs/intent-comment.md index 319ce77..c85350e 100644 --- a/docs/intent-comment.md +++ b/docs/intent-comment.md @@ -19,6 +19,7 @@ Kept here so it can be reviewed and stays in sync with > they're a better fit for this work than a single Standards Track SEP. > > Two things pushed us here: +> > - @localden's review ask for a **narrower first cut** — the concern that a > broad taxonomy with array-or-scalar polymorphism is hard to remove or change > once it lands. @@ -31,39 +32,30 @@ Kept here so it can be reviewed and stays in sync with > independently-shippable extensions**, each with its own > `io.modelcontextprotocol/…` identifier, reference implementation, and path to > an Extensions Track SEP. Incubation is in -> [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations): +> [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations). 
+> +> Kicking off with a few draft extensions in the tool annotations repo — not +> sure yet whether they'd each need separate repos eventually, or whether +> grouping them in one is fine. That's part of what incubation is for. > > | Extension | Scope | > |---|---| > | `trust-annotations` | The narrow data-classification taxonomy (`sensitive`, `untrusted`) + an open-ended `evidenceRef` pointer for richer, out-of-band evidence. | > | `action-metadata` | Tool I/O + outcome contract (folds in @rreichel3's SEP-2061). | -> | `ifc-fides` | Information-flow control ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) as **one profile** of `evidenceRef`, not a wire root — the public/private-repo confidentiality case, with github-mcp-server as emitter candidate. | +> | `ifc-fides` | Information-flow control ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) as **one profile** of `evidenceRef`, not a wire root — the public/private-repo confidentiality case, with github-mcp-server as an emitter example. | > -> Deliberately **not** in the initial carve: `maliciousActivityHint` (the -> structural concerns raised here are unresolved) and session-level propagation -> rules. Those stay parked on this umbrella thread. +> Deliberately removed: `maliciousActivityHint` (the structural concerns raised +> here are unresolved) and session-level propagation rules. > > This follows the same Standards-Track → Extensions-Track refactor pattern as -> SEP-2127 (#2893). This PR stays open as the umbrella / problem-framing thread; -> the schema-bearing pieces move out. Feedback on the carve is very welcome — -> particularly on whether any parked item deserves its own extension sooner. +> SEP-2127 (#2893). This PR will eventually pivot to the `trust-annotations` +> piece itself, with the other schema-bearing pieces moving out into their own +> extensions. 
Everything is still in the incubation phase, so naming, design, +> and the choice of what to put forward as an extension are all open for +> discussion in the IG. --- -## For SEP-2061 (abbreviated pointer) +## For SEP-2061 (coordination note) -> Cross-linking for visibility: the Tool Annotations IG is carving the trust / -> privacy / action-metadata work into small experimental extensions in -> [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations) -> (background: SEP-1913 comment -> [here](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913), -> and the [May 28 IG decision](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820)). -> -> This proposal maps directly onto the -> [`action-metadata`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/blob/main/specification/draft/action-metadata.mdx) -> extension — `inputMetadata` / `returnMetadata` / outcomes, plus a -> `requiresReview` signal we pulled out of the trust taxonomy because it's a -> workflow concern, not a data property. The intent is to carry SEP-2061 forward -> as that extension rather than run a parallel proposal. @rreichel3 — flagging -> so we co-own it rather than diverge; happy to keep this thread as the home for -> the field semantics. +> @rreichel3 — carving this out into an independent extension as discussed: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4675047154 From 70ec91c5f7cb8552a2aed23b33677e5930147180 Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Thu, 11 Jun 2026 00:12:30 +0200 Subject: [PATCH 03/10] docs: drop jargon, make repo docs descriptive Replace 'carve'/'headline' throughout; reframe the front-facing docs (README, spec drafts, related-work) to describe what the repo contains and link to it rather than narrate the process. Process framing stays in the decision log and the intent-comment/sep-disposition planning docs. 
Live SEP-1913/SEP-2061 comments updated to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- README.md | 34 +++++++++++------------ docs/decisions.md | 8 +++--- docs/intent-comment.md | 22 +++++++-------- docs/open-questions.md | 2 +- docs/related-work.md | 2 +- docs/sep-disposition.md | 16 +++++------ specification/draft/ifc-fides.mdx | 21 +++++++------- specification/draft/trust-annotations.mdx | 7 +++-- 8 files changed, 56 insertions(+), 56 deletions(-) diff --git a/README.md b/README.md index 4f0dbae..8c07123 100644 --- a/README.md +++ b/README.md @@ -38,41 +38,41 @@ See [docs/decisions.md](docs/decisions.md) for the decision record and | Identifier | Status | What it specifies | Reference implementation(s) | | :--- | :--- | :--- | :--- | -| [`io.modelcontextprotocol/trust-annotations`](specification/draft/trust-annotations.mdx) | Draft skeleton | **Headline.** A small, scheme-agnostic client-facing data-classification vocabulary (`sensitive`, `untrusted`) on result `_meta`, plus an optional `evidenceRef` pointer slot that carries richer payloads out-of-band. | Python SDK: [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) (138-test suite, healthcare demo, LLM usability study). | +| [`io.modelcontextprotocol/trust-annotations`](specification/draft/trust-annotations.mdx) | Draft skeleton | **Primary extension.** A small, scheme-agnostic client-facing data-classification vocabulary (`sensitive`, `untrusted`) on result `_meta`, plus an optional `evidenceRef` pointer slot that carries richer payloads out-of-band. | Python SDK: [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) (138-test suite, healthcare demo, LLM usability study). | | [`io.modelcontextprotocol/action-metadata`](specification/draft/action-metadata.mdx) | Draft skeleton | `inputMetadata` / `returnMetadata` / outcome classifiers (incl. 
`requires_review`) on `ToolAnnotations`, describing where inputs go, where outputs originate, and what real-world effects a tool can cause. | Carries forward [SEP-2061 (Action Security Metadata)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) by [@rreichel3](https://github.com/rreichel3); reference impl per that proposal (`read_drafts` / `list_inbox` / `send_email`). | | [`io.modelcontextprotocol/ifc-fides`](specification/draft/ifc-fides.mdx) | Draft skeleton | A **profile** of the `trust-annotations` `evidenceRef` slot: `type: "ifc.fides.v1"` carrying an integrity + confidentiality label for deterministic information-flow control, following the FIDES paper ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). | Emitter candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server) (does not emit IFC labels today — closing that gap is the proof point). | -### Why FIDES is a profile, not the headline +### Why FIDES is a profile, not a top-level extension -An earlier sketch made information-flow control the top-level extension. That -was the wrong cut: it bakes one academic model (an integrity × confidentiality -lattice) into the namespace root and silently forecloses the other enforcement -models reviewers raised — capability tokens, caller/tool cosigning, and -sequence-shape audit records. As one reviewer put it, IFC "fits relatively well -if you use annotations" — an endorsement of IFC *as a profile*, not as the wire -root. Demoting it to a `type` value under `trust-annotations`'s open-ended -`evidenceRef` slot keeps the FIDES work first-class while leaving room for every -other model to occupy the same slot. +Information-flow control is modelled as a profile rather than the namespace +root because IFC (an integrity × confidentiality lattice) is one enforcement +model among several that reviewers raised — capability tokens, caller/tool +cosigning, and sequence-shape audit records. 
A top-level `ifc/` root would bake +one academic model into the namespace and foreclose the others. As one reviewer +put it, IFC "fits relatively well if you use annotations" — an endorsement of +IFC *as a profile*, not as the wire root. As a `type` value under +`trust-annotations`'s open-ended `evidenceRef` slot, the FIDES work stays +first-class while every other model can occupy the same slot. ## Relationship to SEP-1913 SEP-1913 remains the canonical place to discuss the overall problem framing. -This repo carves the schema-bearing parts of that proposal into independently -shippable pieces. When an extension here is ready to graduate, an Extensions -Track SEP can reference this repo as the prior art and the working +This repository develops the schema-bearing parts of that proposal as +independently shippable extensions. When an extension here is ready to graduate, +an Extensions Track SEP can reference this repo as the prior art and the working implementation that SEP-2133 [requires](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#creation). For the full per-SEP plan — what happens to SEP-1913, SEP-2061, SEP-1862 and others, and the SEP-2127 refactor precedent — see [docs/sep-disposition.md](docs/sep-disposition.md). -**Deliberately not carved here** (see [docs/open-questions.md](docs/open-questions.md)): +**Out of scope for these extensions** (see [docs/open-questions.md](docs/open-questions.md)): - **`maliciousActivityHint`** — reviewer concerns are structural (it fires at `tools/resolve` before execution can produce evidence; a boolean is the wrong granularity for client UX; clients won't trust server self-attestation). If it - returns, it is per-`ContentBlock` with spans, on a different clock. Parked on - the SEP-1913 umbrella as a known cut item. + returns, it is per-`ContentBlock` with spans, on a different clock. It stays + on the SEP-1913 umbrella rather than in an extension here. 
- **Propagation rules** — sensitivity escalation across session boundaries, and the sequence-shape gap (an annotation surface for "this was call N in a flagged sequence") remain open. Likely a future extension once the taxonomy diff --git a/docs/decisions.md b/docs/decisions.md index ff4e23b..33ecb8b 100644 --- a/docs/decisions.md +++ b/docs/decisions.md @@ -3,7 +3,7 @@ Append-only record of design decisions for the Tool Annotations IG's trust / privacy extension work. Newest at the bottom. -## 2026-06-10 — Carve SEP-1913 into independent extensions +## 2026-06-10 — Split SEP-1913 into independent extensions **Decision.** Split the schema-bearing parts of SEP-1913 into separate experimental extensions, each with its own `io.modelcontextprotocol/…` @@ -17,13 +17,13 @@ clock and avoid hard-to-remove schema. ## 2026-06-10 — Three initial extensions -**Decision.** `trust-annotations` (headline), `action-metadata`, `ifc-fides`. +**Decision.** `trust-annotations` (primary), `action-metadata`, `ifc-fides`. **Rationale.** These are the three pieces with either a reference implementation or an existing SEP behind them: Kapil's SDK, SEP-2061 (Reichel), and the FIDES model respectively. -## 2026-06-10 — FIDES is a profile, not the headline +## 2026-06-10 — FIDES is a profile, not a top-level extension **Decision.** Information-flow control is `type: "ifc.fides.v1"`, a profile of the `trust-annotations` `evidenceRef` slot — not a top-level `io.modelcontextprotocol/ifc` @@ -65,7 +65,7 @@ change. ## 2026-06-10 — Parked: maliciousActivityHint, propagation rules -**Decision.** Neither is carved into an extension now; both stay on the SEP-1913 +**Decision.** Neither becomes an extension now; both stay on the SEP-1913 umbrella. 
**Rationale.** `maliciousActivityHint` has unresolved structural objections diff --git a/docs/intent-comment.md b/docs/intent-comment.md index c85350e..ecc3bca 100644 --- a/docs/intent-comment.md +++ b/docs/intent-comment.md @@ -12,14 +12,14 @@ Kept here so it can be reviewed and stays in sync with ## For SEP-1913 > **Intent: split this SEP and migrate to the Extensions Track** -> +> > > A note on direction for everyone following this thread. When SEP-1913 was > first framed, the **Extensions Track** (SEP-2133) and the `experimental-ext-*` > incubation process didn't exist in their current form. They now do, and > they're a better fit for this work than a single Standards Track SEP. -> +> > > Two things pushed us here: -> +> > > - @localden's review ask for a **narrower first cut** — the concern that a > broad taxonomy with array-or-scalar polymorphism is hard to remove or change > once it lands. @@ -27,26 +27,26 @@ Kept here so it can be reviewed and stays in sync with > [May 28 decision](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820) > to pursue trust/privacy as an **experimental extension first**, gather > adoption evidence, then ask core maintainers to absorb anything. -> -> So the plan is to **carve this proposal into a few small, -> independently-shippable extensions**, each with its own +> > +> So the plan is to \*\*split this proposal into a few small, +> independently-shippable extensions\*\*, each with its own > `io.modelcontextprotocol/…` identifier, reference implementation, and path to > an Extensions Track SEP. Incubation is in > [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations). -> +> > > Kicking off with a few draft extensions in the tool annotations repo — not > sure yet whether they'd each need separate repos eventually, or whether > grouping them in one is fine. That's part of what incubation is for. 
-> +> > > | Extension | Scope | > |---|---| > | `trust-annotations` | The narrow data-classification taxonomy (`sensitive`, `untrusted`) + an open-ended `evidenceRef` pointer for richer, out-of-band evidence. | > | `action-metadata` | Tool I/O + outcome contract (folds in @rreichel3's SEP-2061). | > | `ifc-fides` | Information-flow control ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) as **one profile** of `evidenceRef`, not a wire root — the public/private-repo confidentiality case, with github-mcp-server as an emitter example. | -> +> > > Deliberately removed: `maliciousActivityHint` (the structural concerns raised > here are unresolved) and session-level propagation rules. -> +> > > This follows the same Standards-Track → Extensions-Track refactor pattern as > SEP-2127 (#2893). This PR will eventually pivot to the `trust-annotations` > piece itself, with the other schema-bearing pieces moving out into their own @@ -58,4 +58,4 @@ Kept here so it can be reviewed and stays in sync with ## For SEP-2061 (coordination note) -> @rreichel3 — carving this out into an independent extension as discussed: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4675047154 +> @rreichel3 — splitting this out into an independent extension as discussed: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4675047154 \ No newline at end of file diff --git a/docs/open-questions.md b/docs/open-questions.md index c4747c7..57deb26 100644 --- a/docs/open-questions.md +++ b/docs/open-questions.md @@ -36,7 +36,7 @@ Tracked here rather than in the spec drafts, so the drafts stay non-temporal. - GitHub Enterprise `internal` repo visibility → public/private/reader-set mapping (audience is the whole org, broader than collaborators). 
-## Parked (SEP-1913 umbrella, not carved) +## Parked (SEP-1913 umbrella) - **`maliciousActivityHint`** — if it returns, it is per-`ContentBlock` with spans, driven by the host's own detection, not a server-attested boolean. diff --git a/docs/related-work.md b/docs/related-work.md index c8c6e16..f5717fe 100644 --- a/docs/related-work.md +++ b/docs/related-work.md @@ -5,7 +5,7 @@ annotation work. Several were surfaced in IG meetings (notably 2026-05-28). ## SEPs -- [SEP-1913 — Trust and Sensitivity Annotations](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) — the umbrella this work carves from. +- [SEP-1913 — Trust and Sensitivity Annotations](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) — the umbrella proposal these extensions derive from. - [SEP-2061 — Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) — carried forward as `action-metadata`. - [SEP-1862 — Tool Resolution / pre-flight checks](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862) — core-protocol, composes with these extensions. - [SEP-2133 — Extensions](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md) — the framework this repo incubates under. diff --git a/docs/sep-disposition.md b/docs/sep-disposition.md index f930340..ab22421 100644 --- a/docs/sep-disposition.md +++ b/docs/sep-disposition.md @@ -25,7 +25,7 @@ form. Two inputs since then point at a different shape: adoption/evidence base, and only then ask core maintainers to absorb anything into the protocol. -The result: carve the schema-bearing parts into small, independent extensions, +The result: split the schema-bearing parts into small, independent extensions, each able to graduate on its own clock. ## The precedent: SEP-2127 (Server Cards) @@ -50,7 +50,7 @@ We apply the same playbook below. 
### SEP-1913 — Trust and Sensitivity Annotations **Proposed:** becomes the **umbrella / problem-framing** thread. The schema- -bearing content is carved out into the extensions below. Options, in order of +bearing content moves into the extensions below. Options, in order of preference: - **(A, preferred)** Keep the PR open as the framing umbrella; add an intent @@ -62,8 +62,8 @@ preference: - **(C)** Close 1913 outright and open three fresh Extensions Track SEPs. Loses the discussion history's continuity; not preferred. -**Carved out:** `trust-annotations`, `action-metadata`, `ifc-fides`. -**Parked on the umbrella (not carved):** `maliciousActivityHint`, +**Moved into extensions:** `trust-annotations`, `action-metadata`, `ifc-fides`. +**Parked on the umbrella:** `maliciousActivityHint`, session-level propagation rules. See [open-questions.md](./open-questions.md). ### SEP-2061 — Action Security Metadata @@ -73,7 +73,7 @@ extension. SEP-2061 is by [@rreichel3](https://github.com/rreichel3), who is also an IG co-facilitator and SEP-1913 co-author, so this is a fold-in, not a collision. Disposition mirrors 1913 option (A): keep the thread as the field- semantics discussion, add a pointer comment linking it to the -`action-metadata` carve, refactor to Extensions Track when ready. +`action-metadata` extension, refactor to Extensions Track when ready. ### SEP-1862 — Tool Resolution (pre-flight checks) @@ -86,8 +86,8 @@ with it if it lands, but do not block on it. ### Other related SEPs (not owned here) - **SEP-1984 (Comprehensive Tool Annotations)**, **SEP-2417 (Model Preferences - for Tools)** — tracked by the IG as discussion items; not part of this carve. - Cross-link only. + for Tools)** — tracked by the IG as discussion items; not part of these + extensions. Cross-link only. - **SEP-2787 (Tool Call Attestation)** and the various attestation/evidence threads — these are natural `evidenceRef` *profile* candidates rather than competitors. 
Coordinate so the `evidenceRef.type` registry can list them. @@ -96,7 +96,7 @@ with it if it lands, but do not block on it. | SEP | Title | Proposed disposition | Extension home | | :-- | :-- | :-- | :-- | -| 1913 | Trust & Sensitivity Annotations | Umbrella thread; carve schema out | `trust-annotations` (+ `ifc-fides`) | +| 1913 | Trust & Sensitivity Annotations | Umbrella thread; schema moves to extensions | `trust-annotations` (+ `ifc-fides`) | | 2061 | Action Security Metadata | Fold into extension | `action-metadata` | | 1862 | Tool Resolution (pre-flight) | Stays core / Standards Track | — (composes, no dependency) | | 1984 | Comprehensive Tool Annotations | IG discussion item | — | diff --git a/specification/draft/ifc-fides.mdx b/specification/draft/ifc-fides.mdx index 5895d0a..eb1ea83 100644 --- a/specification/draft/ifc-fides.mdx +++ b/specification/draft/ifc-fides.mdx @@ -26,17 +26,16 @@ protocol or into the `trust-annotations` wire surface. ## Why a profile -An earlier sketch proposed information-flow control as a top-level extension -(`io.modelcontextprotocol/ifc`). That was the wrong cut. IFC is **one** -enforcement model among several that reviewers of SEP-1913 raised — capability -tokens, caller/tool cosigning, and sequence-shape audit records were all put -forward. Making the FIDES integrity × confidentiality lattice the namespace -root would silently foreclose those. - -Carrying it as a `type` value under the open-ended `evidenceRef` slot keeps the -FIDES work first-class while leaving the slot free for every other model. One -reviewer's framing captured it: IFC "fits relatively well *if you use -annotations*" — an endorsement of IFC as a profile, not as the wire root. +Information-flow control is one enforcement model among several that reviewers +of SEP-1913 raised — capability tokens, caller/tool cosigning, and +sequence-shape audit records were all put forward. 
A top-level extension +(`io.modelcontextprotocol/ifc`) would make the FIDES integrity × confidentiality +lattice the namespace root and silently foreclose those other models. + +As a `type` value under the open-ended `evidenceRef` slot, the FIDES label is +first-class while the slot stays free for every other model. One reviewer's +framing captured it: IFC "fits relatively well *if you use annotations*" — an +endorsement of IFC as a profile, not as the wire root. ## Motivation diff --git a/specification/draft/trust-annotations.mdx b/specification/draft/trust-annotations.mdx index d43200c..f9fc4bb 100644 --- a/specification/draft/trust-annotations.mdx +++ b/specification/draft/trust-annotations.mdx @@ -20,9 +20,10 @@ related: This extension defines a small, stable, scheme-agnostic vocabulary for classifying **data in transit** through MCP tool results, plus an optional `evidenceRef` pointer that lets a deployment attach richer, out-of-band -evidence without growing the on-wire schema. It is the headline extension of -the Tool Annotations IG's trust work; other extensions (information-flow -control, action metadata) compose with it rather than duplicating it. +evidence without growing the on-wire schema. It is the primary +data-classification extension in the Tool Annotations IG's trust work; other +extensions (information-flow control, action metadata) compose with it rather +than duplicating it. The design follows the [@localden](https://github.com/localden) review steer on SEP-1913 — take a *narrow first cut* of the taxonomy and avoid hard-to-remove From 716eb8d89facbee07973243a91a8bd5bce39a337 Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Thu, 11 Jun 2026 00:14:19 +0200 Subject: [PATCH 04/10] docs: trim spec frontmatter to known fields (title only) Official extension specs (ext-auth) use only `title:` in frontmatter plus a protocol-revision marker.
Remove the invented extension_identifier / status / maintainers / related / profile_of fields and restate the identifier and profile relationship in the body prose where they belong. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- specification/draft/action-metadata.mdx | 12 ++++-------- specification/draft/ifc-fides.mdx | 12 ++++-------- specification/draft/trust-annotations.mdx | 12 ++++-------- 3 files changed, 12 insertions(+), 24 deletions(-) diff --git a/specification/draft/action-metadata.mdx b/specification/draft/action-metadata.mdx index 766f3c2..5ea8d8c 100644 --- a/specification/draft/action-metadata.mdx +++ b/specification/draft/action-metadata.mdx @@ -1,15 +1,11 @@ --- title: Action Metadata -extension_identifier: io.modelcontextprotocol/action-metadata -status: Draft (experimental) -maintainers: - - "@rreichel3" - - "@SamMorrowDrums" -related: - - https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061 - - https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913 --- +**Protocol Revision**: draft + +**Extension identifier:** `io.modelcontextprotocol/action-metadata` + > ⚠️ **Experimental draft skeleton.** This carries forward > [SEP-2061: Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) > by [@rreichel3](https://github.com/rreichel3) into the IG's experimental repo, diff --git a/specification/draft/ifc-fides.mdx b/specification/draft/ifc-fides.mdx index eb1ea83..328ed17 100644 --- a/specification/draft/ifc-fides.mdx +++ b/specification/draft/ifc-fides.mdx @@ -1,15 +1,11 @@ --- title: Information-Flow Control (FIDES profile) -extension_identifier: io.modelcontextprotocol/ifc-fides -status: Draft (experimental) -maintainers: - - "@SamMorrowDrums" -profile_of: io.modelcontextprotocol/trust-annotations -related: - - https://arxiv.org/abs/2505.23643 - - https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913 --- +**Protocol Revision**: draft 
+ +**Extension identifier:** `io.modelcontextprotocol/ifc-fides`  ·  **Profile of:** `io.modelcontextprotocol/trust-annotations` + > ⚠️ **Experimental draft skeleton.** This defines a *profile* of the > [`trust-annotations`](./trust-annotations.mdx) `evidenceRef` slot. It is **not** > a standalone wire root — see [Why a profile](#why-a-profile). diff --git a/specification/draft/trust-annotations.mdx b/specification/draft/trust-annotations.mdx index f9fc4bb..0fc51d4 100644 --- a/specification/draft/trust-annotations.mdx +++ b/specification/draft/trust-annotations.mdx @@ -1,15 +1,11 @@ --- title: Trust Annotations -extension_identifier: io.modelcontextprotocol/trust-annotations -status: Draft (experimental) -maintainers: - - "@SamMorrowDrums" - - "@rreichel3" -related: - - https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913 - - https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md --- +**Protocol Revision**: draft + +**Extension identifier:** `io.modelcontextprotocol/trust-annotations` + > ⚠️ **Experimental draft skeleton.** This document captures the agreed shape > and the open questions. Normative text is intentionally thin pending > reference-implementation validation. 
Substantive discussion happens on PRs From b799af5bea7c3cb0f63368a873d33cb09d3a9027 Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Mon, 15 Jun 2026 23:29:44 +0200 Subject: [PATCH 05/10] spec(ifc-fides): address review; reframe action-metadata for SEP-2061 closure ifc-fides (review by @JoannaaKL): - confidentiality is public/private only on the wire; drop the reader-list (logins aren't uniform across servers, are themselves access-restricted, and reader sets can be huge) - add Reader-set resolution section: two private markers aren't equal, the confidentiality join is the intersection and isn't computable from opaque wire markers, so hosts resolve to concrete reader sets at decision time - integrity join is total/wire-computable; confidentiality join is not - emitter MUST classify per resource, not per repository (public repos serve non-world-readable sub-resources), making github-mcp-server a real proof point action-metadata / docs: - SEP-2061 closed 2026-06-13 in favour of this extension; this draft is now the canonical home. Updated action-metadata.mdx, README, sep-disposition, related-work, and intent-comment status accordingly. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- README.md | 2 +- docs/intent-comment.md | 17 +++--- docs/open-questions.md | 8 ++- docs/related-work.md | 2 +- docs/sep-disposition.md | 15 +++--- specification/draft/action-metadata.mdx | 12 +++-- specification/draft/ifc-fides.mdx | 71 +++++++++++++++++++------ 7 files changed, 87 insertions(+), 40 deletions(-) diff --git a/README.md b/README.md index 8c07123..2efc8ad 100644 --- a/README.md +++ b/README.md @@ -39,7 +39,7 @@ See [docs/decisions.md](docs/decisions.md) for the decision record and | Identifier | Status | What it specifies | Reference implementation(s) | | :--- | :--- | :--- | :--- | | [`io.modelcontextprotocol/trust-annotations`](specification/draft/trust-annotations.mdx) | Draft skeleton | **Primary extension.** A small, scheme-agnostic client-facing data-classification vocabulary (`sensitive`, `untrusted`) on result `_meta`, plus an optional `evidenceRef` pointer slot that carries richer payloads out-of-band. | Python SDK: [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) (138-test suite, healthcare demo, LLM usability study). | -| [`io.modelcontextprotocol/action-metadata`](specification/draft/action-metadata.mdx) | Draft skeleton | `inputMetadata` / `returnMetadata` / outcome classifiers (incl. `requires_review`) on `ToolAnnotations`, describing where inputs go, where outputs originate, and what real-world effects a tool can cause. | Carries forward [SEP-2061 (Action Security Metadata)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) by [@rreichel3](https://github.com/rreichel3); reference impl per that proposal (`read_drafts` / `list_inbox` / `send_email`). | +| [`io.modelcontextprotocol/action-metadata`](specification/draft/action-metadata.mdx) | Draft skeleton | `inputMetadata` / `returnMetadata` / outcome classifiers (incl. 
`requires_review`) on `ToolAnnotations`, describing where inputs go, where outputs originate, and what real-world effects a tool can cause. | Originally [SEP-2061 (Action Security Metadata)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) by [@rreichel3](https://github.com/rreichel3) — closed 2026-06-13 in favour of this extension; worked example `read_drafts` / `list_inbox` / `send_email`. | | [`io.modelcontextprotocol/ifc-fides`](specification/draft/ifc-fides.mdx) | Draft skeleton | A **profile** of the `trust-annotations` `evidenceRef` slot: `type: "ifc.fides.v1"` carrying an integrity + confidentiality label for deterministic information-flow control, following the FIDES paper ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). | Emitter candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server) (does not emit IFC labels today — closing that gap is the proof point). | ### Why FIDES is a profile, not a top-level extension diff --git a/docs/intent-comment.md b/docs/intent-comment.md index ecc3bca..7a99730 100644 --- a/docs/intent-comment.md +++ b/docs/intent-comment.md @@ -1,11 +1,12 @@ -# Intent comment (draft, pre-post review) - -This is the comment we plan to post on -[SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913), -with an abbreviated pointer version for -[SEP-2061](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061). -Kept here so it can be reviewed and stays in sync with -[sep-disposition.md](./sep-disposition.md). +# Intent comments (posted) + +Both comments below have been **posted**. Kept here as the source of record, +in sync with [sep-disposition.md](./sep-disposition.md). + +- **SEP-1913** umbrella comment — [posted 2026-06-10](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4675047154). 
+- **SEP-2061** coordination note — [posted 2026-06-10](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061#issuecomment-4675049171); + @localden then **closed SEP-2061 on 2026-06-13** in favour of the + `action-metadata` extension. --- diff --git a/docs/open-questions.md b/docs/open-questions.md index 57deb26..4d5749e 100644 --- a/docs/open-questions.md +++ b/docs/open-questions.md @@ -33,8 +33,12 @@ Tracked here rather than in the spec drafts, so the drafts stay non-temporal. ## ifc-fides - Inline `_meta.ifc` for low-friction adoption vs. always behind `evidenceRef`. -- GitHub Enterprise `internal` repo visibility → public/private/reader-set - mapping (audience is the whole org, broader than collaborators). +- GitHub Enterprise `internal` repo visibility → `public`/`private` mapping + (audience is the whole org, broader than collaborators; resolved host-side). +- Reader-set resolution is host-side by design — confidentiality join across two + `private` sources needs the intersection, which the opaque wire marker can't + express. Is the 3-step host resolution enough, or do some hosts need a + standard `evidenceRef.ref` shape to locate the originating system? ## Parked (SEP-1913 umbrella) diff --git a/docs/related-work.md b/docs/related-work.md index f5717fe..fdaf230 100644 --- a/docs/related-work.md +++ b/docs/related-work.md @@ -6,7 +6,7 @@ annotation work. Several were surfaced in IG meetings (notably 2026-05-28). ## SEPs - [SEP-1913 — Trust and Sensitivity Annotations](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) — the umbrella proposal these extensions derive from. -- [SEP-2061 — Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) — carried forward as `action-metadata`. +- [SEP-2061 — Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) — closed 2026-06-13; carried forward as `action-metadata`. 
- [SEP-1862 — Tool Resolution / pre-flight checks](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862) — core-protocol, composes with these extensions. - [SEP-2133 — Extensions](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md) — the framework this repo incubates under. - [SEP-2127 — Server Cards](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893) — precedent for the Standards→Extensions Track refactor. diff --git a/docs/sep-disposition.md b/docs/sep-disposition.md index ab22421..18e2505 100644 --- a/docs/sep-disposition.md +++ b/docs/sep-disposition.md @@ -68,12 +68,13 @@ session-level propagation rules. See [open-questions.md](./open-questions.md). ### SEP-2061 — Action Security Metadata -**Proposed:** becomes the [`action-metadata`](../specification/draft/action-metadata.mdx) -extension. SEP-2061 is by [@rreichel3](https://github.com/rreichel3), who is -also an IG co-facilitator and SEP-1913 co-author, so this is a fold-in, not a -collision. Disposition mirrors 1913 option (A): keep the thread as the field- -semantics discussion, add a pointer comment linking it to the -`action-metadata` extension, refactor to Extensions Track when ready. +**Disposition:** **closed 2026-06-13** in favour of the +[`action-metadata`](../specification/draft/action-metadata.mdx) extension. +SEP-2061 is by [@rreichel3](https://github.com/rreichel3), who is also an IG +co-facilitator and SEP-1913 co-author, so this was a fold-in, not a collision. +[@localden](https://github.com/localden) closed the PR (no active sponsor) after +agreeing the extension is the right home; the extension now carries the field +semantics forward, with SEP-2061 preserved as the origin and credit. ### SEP-1862 — Tool Resolution (pre-flight checks) @@ -97,7 +98,7 @@ with it if it lands, but do not block on it. 
| SEP | Title | Proposed disposition | Extension home | | :-- | :-- | :-- | :-- | | 1913 | Trust & Sensitivity Annotations | Umbrella thread; schema moves to extensions | `trust-annotations` (+ `ifc-fides`) | -| 2061 | Action Security Metadata | Fold into extension | `action-metadata` | +| 2061 | Action Security Metadata | **Closed 2026-06-13**; lives as extension | `action-metadata` | | 1862 | Tool Resolution (pre-flight) | Stays core / Standards Track | — (composes, no dependency) | | 1984 | Comprehensive Tool Annotations | IG discussion item | — | | 2417 | Model Preferences for Tools | IG discussion item | — | diff --git a/specification/draft/action-metadata.mdx b/specification/draft/action-metadata.mdx index 5ea8d8c..dd27eee 100644 --- a/specification/draft/action-metadata.mdx +++ b/specification/draft/action-metadata.mdx @@ -10,8 +10,9 @@ title: Action Metadata > [SEP-2061: Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) > by [@rreichel3](https://github.com/rreichel3) into the IG's experimental repo, > per the May 28 2026 decision to pursue trust/privacy work as an extension -> first. SEP-2061 remains the canonical discussion thread for the field -> semantics. +> first. SEP-2061 was [closed](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061#issuecomment-4675049171) +> on 2026-06-13 in favour of this extension as the home for the work; this draft +> is now the canonical place to discuss the field semantics. ## Abstract @@ -81,9 +82,9 @@ Carried under the extension-namespaced key on `ToolAnnotations`: | `outcome` | Real-world effect class: `benign` / `consequential` / `irreversible`. | | `requiresReview` | The tool author signals that a host SHOULD obtain explicit human confirmation before invocation. | -> Exact enum value sets are inherited from SEP-2061 and are **not** re-litigated -> here; this draft tracks that proposal. Where SEP-2061 evolves, this file -> follows. 
+> Exact enum value sets are inherited from SEP-2061 as the starting point and +> are **not** re-litigated here. With SEP-2061 now closed, this draft is where +> they evolve. ### `requiresReview` lives here, deliberately @@ -130,3 +131,4 @@ would anchor the draft in a real ecosystem. | Date | Change | | ---------- | ------------------------------------------------------------------- | | 2026-06-10 | Initial draft skeleton, carrying SEP-2061 into the experimental repo; absorbed `requiresReview` from the trust taxonomy. | +| 2026-06-15 | SEP-2061 closed in favour of this extension; this draft is now the canonical home for the field semantics. | diff --git a/specification/draft/ifc-fides.mdx b/specification/draft/ifc-fides.mdx index 328ed17..7f9af0a 100644 --- a/specification/draft/ifc-fides.mdx +++ b/specification/draft/ifc-fides.mdx @@ -44,11 +44,12 @@ integrity (is this data trusted) as context accumulates across tool calls, and deny or prompt before a flow violates policy. A public MCP server is the natural emitter. [`github-mcp-server`](https://github.com/github/github-mcp-server) -returns repository data whose confidentiality is determined by repo visibility -and collaborator sets — exactly the public/private signal above — but does -**not** emit IFC labels today. Closing that emitter gap is the concrete proof -point for this profile: a host-side consumer of the label shape already exists, -so the missing half is a server willing to emit it. +returns repository data whose confidentiality follows from repository visibility +and collaborator sets — the same public/private signal — but does **not** emit +IFC labels today. Closing that emitter gap is the concrete proof point for this +profile: a host-side consumer of the label shape already exists, so the missing +half is a server willing to emit it, classifying each resource it returns (see +[per-resource classification](#reference-implementation)). 
## Specification @@ -67,32 +68,62 @@ be inlined by deployments that accept the wire cost) has the shape: ```jsonc { "integrity": "trusted", // "trusted" | "untrusted" (FIDES §4.1 two-level lattice) - "confidentiality": "public" // "public" | "private" | ["login1","login2", …] + "confidentiality": "public" // "public" | "private" } ``` | Field | Meaning | | :--- | :--- | | `integrity` | Two-level integrity lattice (`trusted` ⊑ `untrusted`): trusted data may flow to untrusted sinks, not vice versa. | -| `confidentiality` | `"public"` = world-readable; `"private"` = an opaque marker the host resolves to a reader set (e.g. repo collaborators); an explicit `string[]` = pre-resolved reader logins. Fewer readers = more confidential = higher in the lattice. | +| `confidentiality` | `"public"` = world-readable; `"private"` = an opaque marker meaning "restricted to some reader set". The concrete reader set is resolved host-side at policy-decision time (see [Reader-set resolution](#reader-set-resolution)). | + +> **Confidentiality is `public` / `private` only — never a reader list on the +> wire.** Emitting concrete reader identities (e.g. logins) is out of scope: user +> identity is not uniform across servers using different auth methods, the +> identities are themselves access-restricted data, and a single resource can +> have hundreds of readers. The opaque marker keeps the wire shape stable and the +> sensitive resolution host-side. ### Label semantics - **Join on accumulation.** As a session ingests labeled results, the context label is the *join* of what it has seen: integrity degrades toward - `untrusted`, confidentiality narrows toward the smallest reader set. + `untrusted`, confidentiality narrows toward the smallest permitted reader set. + The **integrity join is total and computable from the wire values alone** + (`untrusted` dominates). 
The **confidentiality join is NOT computable from the + opaque wire markers alone** — see [Reader-set resolution](#reader-set-resolution). - **Policy check before egress.** Before a write/egress tool call, the host checks whether the current context label may flow to the call's target. When a label is absent, the host falls back to its default (trusted-action) policy rather than assuming the worst — labels are an *additive* signal. -- **Confidentiality resolution.** `"private"` is intentionally opaque on the - wire; resolving it to a concrete reader set (e.g. via a collaborators lookup) - is a host concern, so servers need not enumerate audiences inline. > The normative integrity/confidentiality lattice definitions follow the FIDES > paper, §4.1 and §4.3. This draft references the model rather than restating > the proofs. +### Reader-set resolution + +`"private"` is intentionally opaque on the wire. Two distinct `"private"` +markers (e.g. file contents from two different private repositories) are **not +equal**, and their confidentiality join is **not** the same `"private"` token: +data derived from both may flow only to principals who can read *both* sources — +the intersection of their reader sets. The opaque marker cannot express this +intersection, so a host that needs to make a precise cross-source flow decision +MUST resolve each `"private"` marker to a concrete reader set before joining. + +Resolution is a host-side concern, performed at policy-decision time: + +1. The host maps each contributing `"private"` label back to its source (e.g. + via the `evidenceRef.ref` locator, or its own record of which tool result + carried the label). +2. The host queries the originating system for the current reader set (e.g. a + repository collaborators lookup) using its own credentials. +3. The host computes the flow decision over the resolved sets (intersection for + a join of multiple private sources) and then discards them. 
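The three resolution steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not part of the draft or of any SDK: `LabeledSource`, `join_readers`, `may_flow`, and the `lookup_readers` callback are hypothetical names, the `ref` field stands in for whatever locator the host keeps (for example `evidenceRef.ref`), and raising `LookupError` for an unresolvable `"private"` label is one possible host policy rather than a rule stated here.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Optional

PUBLIC = "public"
PRIVATE = "private"


@dataclass(frozen=True)
class LabeledSource:
    """One tool result as the host remembers it (hypothetical shape)."""
    confidentiality: str          # wire value: "public" or "private"
    ref: Optional[str] = None     # host-side locator for the source, if known


def join_readers(
    sources: Iterable[LabeledSource],
    lookup_readers: Callable[[str], Iterable[str]],
) -> Optional[FrozenSet[str]]:
    """Join the confidentiality of several sources.

    Returns None when every source is public (universal reader set),
    otherwise the intersection of the resolved private reader sets.
    """
    readers: Optional[FrozenSet[str]] = None  # None means "everyone"
    for src in sources:
        if src.confidentiality == PUBLIC:
            continue  # a public source never narrows the reader set
        if src.ref is None:
            # Assumption: an unresolvable private label is an error.
            raise LookupError("private label without a resolvable source")
        resolved = frozenset(lookup_readers(src.ref))  # step 2: query the source
        readers = resolved if readers is None else readers & resolved
    return readers


def may_flow(
    sources: Iterable[LabeledSource],
    recipient: str,
    lookup_readers: Callable[[str], Iterable[str]],
) -> bool:
    """Step 3: decide the flow over the resolved sets, then drop them."""
    try:
        readers = join_readers(sources, lookup_readers)
    except LookupError:
        return False  # assumption: deny when resolution fails
    return readers is None or recipient in readers
```

The resolved sets live only inside the call, which matches the "and then discards them" step. A host would pass its own credentialed lookup (for example a collaborators query) as `lookup_readers`.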
+ +Keeping resolution host-side is what lets the wire stay at `public` / `private` +while still supporting precise multi-source decisions. + ### Relationship to `trust-annotations` `ifc-fides` never appears without a host `trust-annotations` annotation @@ -106,17 +137,24 @@ the IFC label is the precise, host-checkable evidence behind them. applies a flow policy before egress operations already exists in practice. (Linked once a public reference is available.) - **Emitter (gap / proof point):** [`github-mcp-server`](https://github.com/github/github-mcp-server) - is the candidate — it already knows repo visibility and collaborator sets, - which are exactly the confidentiality inputs. + is the candidate — it already knows repository visibility and collaborator + sets, which are the confidentiality inputs. Repository visibility is only a + *default* hint, not the whole story: a public repository can serve + sub-resources that are **not** world-readable (draft security advisories, + draft releases, the collaborator roster itself, authenticated-user fields), so + a correct emitter MUST classify **per resource returned**, not per repository. + That makes the emitter a non-trivial proof point rather than a one-line + `repo.private` read. ## Open questions - Should the label be inlinable on `_meta.ifc` directly for low-friction adoption, or always behind `evidenceRef` for schema minimalism? (Lean: permit both; `evidenceRef` is canonical, inline is a convenience.) -- How does GitHub Enterprise `internal` repo visibility map onto the - public/private/reader-set confidentiality model? (Audience is the whole org, - strictly broader than collaborators — likely falls back to default policy.) +- How does GitHub Enterprise `internal` repository visibility map onto the + `public` / `private` confidentiality model? (Audience is the whole org, + strictly broader than collaborators — likely classified `private` and resolved + host-side, or falls back to default policy.) 
- Registry coordination with other attestation/evidence profiles (e.g. SEP-2787) so `evidenceRef.type` values don't collide. @@ -125,3 +163,4 @@ the IFC label is the precise, host-checkable evidence behind them. | Date | Change | | ---------- | ------------------------------------------------------------ | | 2026-06-10 | Initial draft skeleton. Reframed from a top-level `ifc` extension to a `trust-annotations` `evidenceRef` profile (`ifc.fides.v1`). | +| 2026-06-15 | Confidentiality limited to `public` / `private` on the wire (dropped reader-list); added Reader-set resolution section; emitter classifies per resource, not per repository. (Review: @JoannaaKL.) | From 4967a9eccc5ac74e30780ad9bb8bd406c6cc5ea7 Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Tue, 16 Jun 2026 00:34:46 +0200 Subject: [PATCH 06/10] spec: tighten ifc-fides framing and evidenceRef canonicalization (review @Rul1an) ifc-fides: - lead Label semantics with the load-bearing wire/host split (wire markers are advisory hints; reader-set semantics are host-resolved) - state the join asymmetry as principled, not incidental: integrity join is total/wire-computable (small closed lattice); confidentiality join is partial because public is the only wire-computable case (reader set = top), while private join is the intersection the opaque markers don't carry - frame wire opaqueness as a property of the security model, not a spec limitation - add fail-closed handling when resolution is unavailable (ref absent, digest-only, source unreachable): never treat opaque labels as equal or as public; unknown/ mixed provenance classifies private; resolved reader sets are not durable grants trust-annotations: - show jcs/rfc8785 alongside cbor/rfc8949 so canonicalization reads as a per-reference envelope choice rather than a default - note type/digest/canonicalization as the local-re-derivation minimum Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- specification/draft/ifc-fides.mdx | 51 
+++++++++++++++++------ specification/draft/trust-annotations.mdx | 9 +++- 2 files changed, 47 insertions(+), 13 deletions(-) diff --git a/specification/draft/ifc-fides.mdx b/specification/draft/ifc-fides.mdx index 7f9af0a..622c073 100644 --- a/specification/draft/ifc-fides.mdx +++ b/specification/draft/ifc-fides.mdx @@ -86,12 +86,24 @@ be inlined by deployments that accept the wire cost) has the shape: ### Label semantics +The load-bearing distinction is between the wire and the host: **wire markers are +advisory hints; reader-set semantics are host-resolved.** The asymmetry between +the two joins below follows from that one cut. + - **Join on accumulation.** As a session ingests labeled results, the context label is the *join* of what it has seen: integrity degrades toward `untrusted`, confidentiality narrows toward the smallest permitted reader set. - The **integrity join is total and computable from the wire values alone** - (`untrusted` dominates). The **confidentiality join is NOT computable from the - opaque wire markers alone** — see [Reader-set resolution](#reader-set-resolution). + The two joins differ in *where* they can be computed, and the difference is + principled rather than incidental: + - **Integrity join is total and wire-computable.** The integrity lattice is + small and closed (`trusted ⊑ untrusted`), so `untrusted` dominates and the + join needs nothing beyond the wire values. + - **Confidentiality join is partial and host-resolved.** Reader sets are open + and host-knowledge-dependent. `public` is the one wire-computable case, + because its reader set is universal (`⊤`), which makes it the identity of the + join: `public ⊔ x = x`. + `private ⊔ private`, by contrast, is the *intersection* of two reader sets + that the opaque markers don't carry, so it is **not** computable from the + wire; see [Reader-set resolution](#reader-set-resolution).
- **Policy check before egress.** Before a write/egress tool call, the host checks whether the current context label may flow to the call's target. When a label is absent, the host falls back to its default (trusted-action) @@ -103,13 +115,16 @@ be inlined by deployments that accept the wire cost) has the shape: ### Reader-set resolution -`"private"` is intentionally opaque on the wire. Two distinct `"private"` -markers (e.g. file contents from two different private repositories) are **not -equal**, and their confidentiality join is **not** the same `"private"` token: -data derived from both may flow only to principals who can read *both* sources — -the intersection of their reader sets. The opaque marker cannot express this -intersection, so a host that needs to make a precise cross-source flow decision -MUST resolve each `"private"` marker to a concrete reader set before joining. +`"private"` is intentionally opaque on the wire — and that opaqueness is a +property of the security model, not a limitation of the spec. A reader set is not +transmissible without policy context, so the wire shape correctly declines to +carry it. Two distinct `"private"` markers (e.g. file contents from two different +private repositories) are **not equal**, and their confidentiality join is **not** +the same `"private"` token: data derived from both may flow only to principals who +can read *both* sources — the intersection of their reader sets. The opaque marker +cannot express this intersection, so a host that needs to make a precise +cross-source flow decision MUST resolve each `"private"` marker to a concrete +reader set before joining. Resolution is a host-side concern, performed at policy-decision time: @@ -121,8 +136,19 @@ Resolution is a host-side concern, performed at policy-decision time: 3. The host computes the flow decision over the resolved sets (intersection for a join of multiple private sources) and then discards them. 
-Keeping resolution host-side is what lets the wire stay at `public` / `private` -while still supporting precise multi-source decisions. +**When resolution is unavailable** — the `ref` is absent, the label is +digest-only, or the originating system is unreachable at decision time — the host +MUST NOT treat two opaque labels as equal, and MUST NOT treat `"private"` as +`"public"`. It denies, prompts, or applies its configured fail-closed policy. Two +`"private"` labels are equal only once resolution proves their sources are; until +a source is established, unknown or mixed provenance classifies as `"private"`, +never defaulted to `"public"` from a repository-level shortcut. + +The resolved reader set is a decision-time read performed under the host's own +credentials. It is not a durable grant: a host SHOULD NOT cache it as one or +serialize it back into annotations or evidence unless a deployment explicitly opts +in. This keeps the wire free of user identities while still letting the host make a +precise decision when it holds source provenance plus its own credentials. ### Relationship to `trust-annotations` @@ -164,3 +190,4 @@ the IFC label is the precise, host-checkable evidence behind them. | ---------- | ------------------------------------------------------------ | | 2026-06-10 | Initial draft skeleton. Reframed from a top-level `ifc` extension to a `trust-annotations` `evidenceRef` profile (`ifc.fides.v1`). | | 2026-06-15 | Confidentiality limited to `public` / `private` on the wire (dropped reader-list); added Reader-set resolution section; emitter classifies per resource, not per repository. (Review: @JoannaaKL.) | +| 2026-06-16 | Lead the semantics with the wire-hint / host-resolved split; state the integrity-total vs confidentiality-partial asymmetry as principled (`public` = `⊤` is wire-computable, `private ⊔ private` is not); add fail-closed handling when resolution is unavailable and a no-durable-grant rule for resolved sets. (Review: @Rul1an.) 
| diff --git a/specification/draft/trust-annotations.mdx b/specification/draft/trust-annotations.mdx index 0fc51d4..395e3c8 100644 --- a/specification/draft/trust-annotations.mdx +++ b/specification/draft/trust-annotations.mdx @@ -101,6 +101,12 @@ the SEP-1913 discussion): This single slot subsumes the previously separate `attestationChainRef` / `policyDecisionRef` ideas — both become `type` values. +`canonicalization` is per-reference precisely so different evidence producers can +be re-derived independently. `cbor/rfc8949` and `jcs/rfc8785` (JSON +Canonicalization Scheme) are both valid envelope choices — neither is the default, +and the `type`/`digest`/`canonicalization` triple is the minimum a client needs +for local re-derivation regardless of which is used. + ### Coarse vs. rich classification (DataClass) SEP-1913 carried a four-level data classification @@ -114,7 +120,7 @@ is recovered as an `evidenceRef` profile: "type": "data-class.v1", // record resolves to e.g. { "class": "highly_confidential", "regulatory": ["hipaa"] } "digest": "sha256:…", - "canonicalization": "cbor/rfc8949" + "canonicalization": "jcs/rfc8785" // a JSON-canonicalized record; CBOR is equally valid } ``` @@ -174,3 +180,4 @@ and `evidenceRef` is how that claim is made checkable. | Date | Change | | ---------- | ------------------------------------------------------------- | | 2026-06-10 | Initial draft skeleton. Narrowed to `sensitive` + `untrusted` + `evidenceRef`; DataClass demoted to a profile; `requires_review` moved to `action-metadata`. | +| 2026-06-16 | Show `jcs/rfc8785` alongside `cbor/rfc8949` so canonicalization reads as a per-reference envelope choice, not a default; note the `type`/`digest`/`canonicalization` triple as the local-re-derivation minimum. (Review: @Rul1an.) 
| From 678f846e23ee5eec85ca7cbbfd92eb22fb9271ea Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Tue, 16 Jun 2026 01:23:28 +0200 Subject: [PATCH 07/10] repo: make trust-annotations the base; FIDES becomes a scheme, not a sibling MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Restructure the incubation into two extensions plus interchangeable data-labelling schemes, shipped as a stack of three PRs: - This PR (base): shared scaffolding + the trust-annotations extension. - action-metadata moves to its own stacked PR. - The IFC/FIDES work moves out of specification/draft/ entirely — it is one data-labelling scheme (ifc.fides.v1) that fills the trust-annotations evidenceRef slot, not an extension and not a sibling. It lands in a schemes/ folder built to hold alternative approaches (data-class, capability tokens, cosigning, sequence-shape, attestation) so no single academic model is baked into the wire. README, sep-disposition, related-work, trust-model, open-questions and the decision log are reframed accordingly (two extensions + a schemes folder; the range of candidate schemes drawn from the SEP-1913 thread and cited literature). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- README.md | 63 +++++--- docs/decisions.md | 28 ++++ docs/open-questions.md | 2 +- docs/related-work.md | 8 +- docs/sep-disposition.md | 9 +- docs/trust-model.md | 6 +- specification/draft/action-metadata.mdx | 134 ---------------- specification/draft/ifc-fides.mdx | 193 ------------------------ 8 files changed, 87 insertions(+), 356 deletions(-) delete mode 100644 specification/draft/action-metadata.mdx delete mode 100644 specification/draft/ifc-fides.mdx diff --git a/README.md b/README.md index 2efc8ad..4e14298 100644 --- a/README.md +++ b/README.md @@ -25,11 +25,14 @@ potential narrower first cut?" 
The subsequent design discussion converged on a layered answer: a small, stable annotation surface on the wire, with richer evidence kept out-of-band and referenced by a bounded pointer. -This repo follows that steer. Each concern becomes a **separate experimental -extension** with its own [reverse-DNS identifier](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#definition), -its own reference implementation, and its own path to a future Extensions -Track SEP. Drafts can graduate independently — directly addressing the "narrower -first cut" ask without throwing away the combinatoric value of the full set. +This repo follows that steer. The schema-bearing concerns become **separate +experimental extensions**, each with its own [reverse-DNS identifier](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md#definition), +reference implementation, and path to a future Extensions Track SEP, so drafts +can graduate independently. The concrete data-labelling models that fill an +extension's evidence slot are kept separate again — as interchangeable **schemes** +rather than extensions — so no single academic model is baked into the wire. This +directly addresses the "narrower first cut" ask without throwing away the +combinatoric value of the full set. See [docs/decisions.md](docs/decisions.md) for the decision record and [docs/trust-model.md](docs/trust-model.md) for the shared enforcement model. @@ -40,19 +43,42 @@ See [docs/decisions.md](docs/decisions.md) for the decision record and | :--- | :--- | :--- | :--- | | [`io.modelcontextprotocol/trust-annotations`](specification/draft/trust-annotations.mdx) | Draft skeleton | **Primary extension.** A small, scheme-agnostic client-facing data-classification vocabulary (`sensitive`, `untrusted`) on result `_meta`, plus an optional `evidenceRef` pointer slot that carries richer payloads out-of-band. 
| Python SDK: [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) (138-test suite, healthcare demo, LLM usability study). | | [`io.modelcontextprotocol/action-metadata`](specification/draft/action-metadata.mdx) | Draft skeleton | `inputMetadata` / `returnMetadata` / outcome classifiers (incl. `requires_review`) on `ToolAnnotations`, describing where inputs go, where outputs originate, and what real-world effects a tool can cause. | Originally [SEP-2061 (Action Security Metadata)](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) by [@rreichel3](https://github.com/rreichel3) — closed 2026-06-13 in favour of this extension; worked example `read_drafts` / `list_inbox` / `send_email`. | -| [`io.modelcontextprotocol/ifc-fides`](specification/draft/ifc-fides.mdx) | Draft skeleton | A **profile** of the `trust-annotations` `evidenceRef` slot: `type: "ifc.fides.v1"` carrying an integrity + confidentiality label for deterministic information-flow control, following the FIDES paper ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). | Emitter candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server) (does not emit IFC labels today — closing that gap is the proof point). | -### Why FIDES is a profile, not a top-level extension - -Information-flow control is modelled as a profile rather than the namespace -root because IFC (an integrity × confidentiality lattice) is one enforcement -model among several that reviewers raised — capability tokens, caller/tool -cosigning, and sequence-shape audit records. A top-level `ifc/` root would bake -one academic model into the namespace and foreclose the others. As one reviewer -put it, IFC "fits relatively well if you use annotations" — an endorsement of -IFC *as a profile*, not as the wire root. As a `type` value under -`trust-annotations`'s open-ended `evidenceRef` slot, the FIDES work stays -first-class while every other model can occupy the same slot. 
+Each extension is proposed in its own pull request so it can be reviewed and +graduate on its own clock. + +## Data-labelling schemes (the `evidenceRef` slot) + +The extensions above keep the wire vocabulary deliberately small. Richer +labelling lives **out-of-band**, referenced by the `trust-annotations` +[`evidenceRef`](specification/draft/trust-annotations.mdx) pointer, whose `type` +is an open string. A **scheme** is a concrete data-labelling or tool-annotation +approach that fills that slot under a `type` value. A scheme is **not** an +extension and not a sibling of the two above — it is one interchangeable way to +populate the evidence an extension carries, and a deployment can adopt, swap, or +ignore it without touching the extension. + +The [`schemes/`](schemes/) folder collects these approaches. **FIDES** is the +first worked example, defining `ifc.fides.v1`; it is one model among several that +reviewers and the literature have raised, and the slot is designed so any of them +can occupy it: + +| Scheme | `evidenceRef.type` | Source | +| :--- | :--- | :--- | +| FIDES information-flow control (integrity × confidentiality lattice) | `ifc.fides.v1` | [arXiv:2505.23643](https://arxiv.org/abs/2505.23643); emitter candidate [`github-mcp-server`](https://github.com/github/github-mcp-server) | +| Coarse data classification (4-level + regulatory scope) | `data-class.v1` | SEP-1913 taxonomy | +| Design-pattern controls (Plan-Then-Execute, Dual LLM, Map-Reduce) | _candidate_ | [arXiv:2506.08837](https://arxiv.org/abs/2506.08837) | +| Capability-token constraints (SINT) | _candidate_ | pshkv, [SEP-1913 thread](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) | +| Caller/tool cosigning | _candidate_ | viftode4, SEP-1913 thread | +| Sequence-shape audit records | _candidate_ | marras0914, SEP-1913 thread | +| Tool-call attestation (in-toto / OVERT envelopes) | _candidate_ | 
[SEP-2787](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) | + +Modelling IFC as a scheme rather than a namespace root is deliberate: a top-level +`ifc/` extension would bake one academic model into the wire and foreclose the +others. As one reviewer put it, IFC "fits relatively well if you use annotations" +— an endorsement of IFC *behind* the annotation slot, not as the slot itself. See +[`schemes/README.md`](schemes/README.md) for the full list and the bar for adding +a scheme. ## Relationship to SEP-1913 @@ -84,7 +110,8 @@ This repo mirrors the structure of official extension repositories such as [`ext-auth`](https://github.com/modelcontextprotocol/ext-auth): ``` -specification/draft/.mdx # one spec per extension +specification/draft/.mdx # one spec per extension (trust-annotations, action-metadata) +schemes/ # data-labelling schemes that fill the evidenceRef slot (FIDES, …) docs/ # decision log, open questions, related work MAINTAINERS.md # IG facilitators ``` diff --git a/docs/decisions.md b/docs/decisions.md index 33ecb8b..a451955 100644 --- a/docs/decisions.md +++ b/docs/decisions.md @@ -89,3 +89,31 @@ Resolution. SEP-1862 remains a core/Standards-Track protocol change. **Rationale.** The 2026-05-28 IG meeting concluded pre-flight is inherently a protocol-level change, not an extension. + +## 2026-06-16 — FIDES is a scheme, not a sibling extension + +**Decision.** Refines the 2026-06-10 "FIDES is a profile" decision. The IFC/FIDES +work moves out of `specification/draft/` (where it sat next to the two +extensions) into a `schemes/` folder. There are **two** extensions +(`trust-annotations`, `action-metadata`); FIDES is **one data-labelling scheme** +(`ifc.fides.v1`) that fills the `trust-annotations` `evidenceRef` slot. + +**Rationale.** FIDES is one model the extension *could* use, not a peer of the +extensions, and must not be presented as a sibling. 
The original SEP cites it +alongside ShardGuard and "Design Patterns for Securing LLM Agents," and the +SEP-1913 thread adds capability tokens, cosigning, sequence-shape, and +attestation models — so `schemes/` is a folder for interchangeable approaches, +with FIDES as the first worked one. This shows the range the open `evidenceRef` +slot is meant to carry rather than implying IFC is the privileged model. + +## 2026-06-16 — Three pull requests, stacked + +**Decision.** The work ships as three PRs: `trust-annotations` (the base, +carrying shared repo scaffolding), `action-metadata` (stacked on the base), and +the FIDES scheme in `schemes/` (stacked on the base). The two extensions are +independent; the FIDES scheme depends on `trust-annotations` because it fills +that extension's `evidenceRef` slot. + +**Rationale.** Separate PRs let each piece be reviewed and graduate on its own +clock. FIDES stacks on `trust-annotations` because a scheme has no meaning +without the slot it fills. diff --git a/docs/open-questions.md b/docs/open-questions.md index 4d5749e..7adb9d9 100644 --- a/docs/open-questions.md +++ b/docs/open-questions.md @@ -30,7 +30,7 @@ Tracked here rather than in the spec drafts, so the drafts stay non-temporal. - Open strings vs. closed enums for `destination` / `source` / `sensitivity`. - Does `requiresReview` need a machine-readable *reason* for good client UX? -## ifc-fides +## ifc-fides (scheme) - Inline `_meta.ifc` for low-friction adoption vs. always behind `evidenceRef`. - GitHub Enterprise `internal` repo visibility → `public`/`private` mapping diff --git a/docs/related-work.md b/docs/related-work.md index fdaf230..5cdb88f 100644 --- a/docs/related-work.md +++ b/docs/related-work.md @@ -10,11 +10,11 @@ annotation work. Several were surfaced in IG meetings (notably 2026-05-28). - [SEP-1862 — Tool Resolution / pre-flight checks](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1862) — core-protocol, composes with these extensions. 
- [SEP-2133 — Extensions](https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/seps/2133-extensions.md) — the framework this repo incubates under. - [SEP-2127 — Server Cards](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2893) — precedent for the Standards→Extensions Track refactor. -- [SEP-2787 — Tool Call Attestation](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) — candidate `evidenceRef` profile. +- [SEP-2787 — Tool Call Attestation](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787) — candidate `evidenceRef` scheme. ## Research -- **FIDES** — *Information-flow control for LLM agents.* [arXiv:2505.23643](https://arxiv.org/abs/2505.23643). Basis for the `ifc.fides.v1` profile. +- **FIDES** — *Information-flow control for LLM agents.* [arXiv:2505.23643](https://arxiv.org/abs/2505.23643). Basis for the `ifc.fides.v1` scheme in [`schemes/`](../schemes/). - **Design Patterns for Securing LLM Agents** — IBM/Google/Microsoft. [arXiv:2506.08837](https://arxiv.org/abs/2506.08837). Plan-Then-Execute, Dual LLM, Map-Reduce, etc. - **Trail of Bits** — prompt-injection via hidden content in GitHub issues. [blog](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/). - **OpenAI Auto Review** — https://alignment.openai.com/auto-review/ (shared in IG chat). @@ -22,7 +22,7 @@ annotation work. Several were surfaced in IG meetings (notably 2026-05-28). ## Implementations & tooling - [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) — reference Python SDK PoC for `trust-annotations`. -- [`github-mcp-server`](https://github.com/github/github-mcp-server) — public MCP server; emitter candidate for `ifc-fides` (knows repo visibility + collaborators). 
+- [`github-mcp-server`](https://github.com/github/github-mcp-server) — public MCP server; emitter candidate for the `ifc-fides` scheme (knows repo visibility + collaborators). - **Ethyca** data-labeling docs — https://www.ethyca.com/docs (shared in IG chat). - **GitHub Next** agentic-workflows research on data labeling — to be documented as issues in this repo (IG action item, @gokhanarkan / @joannakl). @@ -35,4 +35,4 @@ annotation work. Several were surfaced in IG meetings (notably 2026-05-28). - **Sequence-shape** policies — marras0914. These are exactly the models that `evidenceRef`'s open `type` is designed to -accommodate as profiles. +accommodate as schemes — see [`schemes/`](../schemes/). diff --git a/docs/sep-disposition.md b/docs/sep-disposition.md index 18e2505..7fe65d0 100644 --- a/docs/sep-disposition.md +++ b/docs/sep-disposition.md @@ -62,7 +62,10 @@ preference: - **(C)** Close 1913 outright and open three fresh Extensions Track SEPs. Loses the discussion history's continuity; not preferred. -**Moved into extensions:** `trust-annotations`, `action-metadata`, `ifc-fides`. +**Moved into extensions:** `trust-annotations`, `action-metadata`. +**Moved into `schemes/`:** the IFC/FIDES work, as one data-labelling **scheme** +(`ifc.fides.v1`) that fills the `trust-annotations` `evidenceRef` slot — not an +extension and not a sibling of the two above. **Parked on the umbrella:** `maliciousActivityHint`, session-level propagation rules. See [open-questions.md](./open-questions.md). @@ -90,14 +93,14 @@ with it if it lands, but do not block on it. for Tools)** — tracked by the IG as discussion items; not part of these extensions. Cross-link only. - **SEP-2787 (Tool Call Attestation)** and the various attestation/evidence - threads — these are natural `evidenceRef` *profile* candidates rather than + threads — these are natural `evidenceRef` **scheme** candidates rather than competitors. Coordinate so the `evidenceRef.type` registry can list them. 
## Mapping table | SEP | Title | Proposed disposition | Extension home | | :-- | :-- | :-- | :-- | -| 1913 | Trust & Sensitivity Annotations | Umbrella thread; schema moves to extensions | `trust-annotations` (+ `ifc-fides`) | +| 1913 | Trust & Sensitivity Annotations | Umbrella thread; schema moves to extensions | `trust-annotations` (+ `schemes/ifc-fides`) | | 2061 | Action Security Metadata | **Closed 2026-06-13**; lives as extension | `action-metadata` | | 1862 | Tool Resolution (pre-flight) | Stays core / Standards Track | — (composes, no dependency) | | 1984 | Comprehensive Tool Annotations | IG discussion item | — | diff --git a/docs/trust-model.md b/docs/trust-model.md index 98774e4..8d85156 100644 --- a/docs/trust-model.md +++ b/docs/trust-model.md @@ -1,7 +1,7 @@ # Trust model -A single statement of the enforcement model shared by all extensions in this -repository, so individual specs don't re-litigate it. +A single statement of the enforcement model shared across this repository's +extensions and data-labelling schemes, so individual specs don't re-litigate it. ## Annotations are claims, not guarantees @@ -41,7 +41,7 @@ SEP-1913 thread): rather than blanket-blocking flows a policy engine is unsure about, **flag the specific call for user confirmation**. This preserves utility while keeping a human on the genuinely risky edges, and is the recommended default for `requiresReview` ([`action-metadata`](../specification/draft/action-metadata.mdx)) -and for IFC policy violations ([`ifc-fides`](../specification/draft/ifc-fides.mdx)). +and for IFC policy violations (the [`ifc-fides`](../schemes/ifc-fides.md) scheme). 
## Cross-domain is the hard case diff --git a/specification/draft/action-metadata.mdx b/specification/draft/action-metadata.mdx deleted file mode 100644 index dd27eee..0000000 --- a/specification/draft/action-metadata.mdx +++ /dev/null @@ -1,134 +0,0 @@ ---- -title: Action Metadata ---- - -**Protocol Revision**: draft - -**Extension identifier:** `io.modelcontextprotocol/action-metadata` - -> ⚠️ **Experimental draft skeleton.** This carries forward -> [SEP-2061: Action Security Metadata](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061) -> by [@rreichel3](https://github.com/rreichel3) into the IG's experimental repo, -> per the May 28 2026 decision to pursue trust/privacy work as an extension -> first. SEP-2061 was [closed](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061#issuecomment-4675049171) -> on 2026-06-13 in favour of this extension as the home for the work; this draft -> is now the canonical place to discuss the field semantics. - -## Abstract - -This extension adds a small, declarative contract to a tool's static -`ToolAnnotations` describing **what the tool does with data**: where inputs may -go, where outputs originate, and what real-world outcome invoking it can cause. -Where [`trust-annotations`](./trust-annotations.mdx) classifies *data in -transit*, this extension classifies *tool behavior*. The two are complementary -and independently adoptable — a client can consume action metadata without -implementing trust annotations at all. - -## Motivation - -MCP today treats all tool calls as equivalent at the protocol level beyond the -coarse `readOnlyHint` / `destructiveHint` / `idempotentHint` / `openWorldHint` -hints. A tool that reads drafts and a tool that sends email are otherwise -indistinguishable, even though their privacy and consent implications differ -radically. Runtimes fall back to inferring risk from tool names or model -behavior, which does not scale. 
- -This was reinforced in the May 28 2026 IG meeting: a model often **cannot tell -whether a target is private or public**, and absent that signal it may push -content somewhere it should not. A declarative behavioral contract lets clients -and models make safer decisions without baking domain knowledge into every -model. - -The canonical worked example from SEP-2061: `read_drafts`, `list_inbox`, and -`send_email` can share an identical JSON Schema yet have completely different -security semantics — only action metadata distinguishes them. - -## Specification - -### Dependencies - -This extension annotates the existing `ToolAnnotations` object returned by -`tools/list`. It has no dependency on `trust-annotations` or on Tool Resolution. - -### Fields - -Carried under the extension-namespaced key on `ToolAnnotations`: - -```jsonc -{ - "annotations": { - "io.modelcontextprotocol/action-metadata": { - "inputMetadata": { - "destination": "external", // where input data may be stored/sent - "sensitivity": "personal" // kind of data the tool accepts - }, - "returnMetadata": { - "source": "open-world", // where returned data originates - "sensitivity": "public" - }, - "outcome": "consequential", // benign | consequential | irreversible - "requiresReview": true // host SHOULD seek human confirmation - } - } -} -``` - -| Field | Meaning | -| :--- | :--- | -| `inputMetadata.destination` | Where data passed to the tool may end up (e.g. `local`, `internal`, `external`). | -| `inputMetadata.sensitivity` | The kind of data the tool is designed to accept. | -| `returnMetadata.source` | Where the tool's returned data originates (e.g. `first-party`, `open-world`). | -| `returnMetadata.sensitivity` | The kind of data the tool is designed to return. | -| `outcome` | Real-world effect class: `benign` / `consequential` / `irreversible`. | -| `requiresReview` | The tool author signals that a host SHOULD obtain explicit human confirmation before invocation. 
| - -> Exact enum value sets are inherited from SEP-2061 as the starting point and -> are **not** re-litigated here. With SEP-2061 now closed, this draft is where -> they evolve. - -### `requiresReview` lives here, deliberately - -`requiresReview` is a **workflow/consent** signal, not a data-classification -property. It was intentionally moved out of [`trust-annotations`](./trust-annotations.mdx) -(which stays strictly data-classifying) to avoid reproducing SEP-1913's -"several concerns in one schema" problem at smaller scale. It sits next to -`outcome` because both describe the *act* of calling the tool rather than the -*data* in flight. - -### Lifecycle and `list_changed` - -These fields are part of the **tool definition** (`ToolAnnotations`). They are -therefore covered by `tools/list_changed`: a server that changes a tool's -action metadata MUST emit `list_changed` as it would for any tool-definition -change. (This is the opposite of `trust-annotations`, which is response-level.) - -## Relationship to existing annotations - -`outcome: irreversible` overlaps conceptually with `destructiveHint` but is -strictly richer (a three-way classification vs. a boolean) and is scoped to the -real-world effect rather than to whether the operation is destructive to -server-side state. The IG will need to decide whether action metadata -*supersedes* or *coexists with* the legacy hints before any graduation. - -## Reference implementation - -Per [SEP-2061](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2061): -the `read_drafts` / `list_inbox` / `send_email` worked example with identical -schemas and divergent action metadata. A public MCP server emitting these -fields (candidate: [`github-mcp-server`](https://github.com/github/github-mcp-server)) -would anchor the draft in a real ecosystem. - -## Open questions - -- Coexistence vs. replacement of `destructiveHint` / `readOnlyHint`. 
-- Whether `destination` / `source` / `sensitivity` enums should be open strings - (consistent with `evidenceRef.type`) or closed enums. -- Whether `requiresReview` needs a machine-readable *reason* (vs. a bare - boolean) to drive good client UX. - -## Changelog - -| Date | Change | -| ---------- | ------------------------------------------------------------------- | -| 2026-06-10 | Initial draft skeleton, carrying SEP-2061 into the experimental repo; absorbed `requiresReview` from the trust taxonomy. | -| 2026-06-15 | SEP-2061 closed in favour of this extension; this draft is now the canonical home for the field semantics. | diff --git a/specification/draft/ifc-fides.mdx b/specification/draft/ifc-fides.mdx deleted file mode 100644 index 622c073..0000000 --- a/specification/draft/ifc-fides.mdx +++ /dev/null @@ -1,193 +0,0 @@ ---- -title: Information-Flow Control (FIDES profile) ---- - -**Protocol Revision**: draft - -**Extension identifier:** `io.modelcontextprotocol/ifc-fides`  ·  **Profile of:** `io.modelcontextprotocol/trust-annotations` - -> ⚠️ **Experimental draft skeleton.** This defines a *profile* of the -> [`trust-annotations`](./trust-annotations.mdx) `evidenceRef` slot. It is **not** -> a standalone wire root — see [Why a profile](#why-a-profile). - -## Abstract - -This extension defines `ifc.fides.v1`, a profile of the `trust-annotations` -`evidenceRef` slot that carries an **information-flow-control label** — -integrity plus confidentiality — following the FIDES model -([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). A host that implements -deterministic information-flow control can consume these labels to decide -whether a tool call is permitted, without baking the IFC model into the core -protocol or into the `trust-annotations` wire surface. 
- -## Why a profile - -Information-flow control is one enforcement model among several that reviewers -of SEP-1913 raised — capability tokens, caller/tool cosigning, and -sequence-shape audit records were all put forward. A top-level extension -(`io.modelcontextprotocol/ifc`) would make the FIDES integrity × confidentiality -lattice the namespace root and silently foreclose those other models. - -As a `type` value under the open-ended `evidenceRef` slot, the FIDES label is -first-class while the slot stays free for every other model. One reviewer's -framing captured it: IFC "fits relatively well *if you use annotations*" — an -endorsement of IFC as a profile, not as the wire root. - -## Motivation - -The motivating case is the one raised in the -[2026-05-28 IG meeting](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820): -**a model often cannot tell whether a repository is public or private**, and -lacking that signal it may push private content to a public destination. An IFC -label lets the host track confidentiality (who may read this data) and -integrity (is this data trusted) as context accumulates across tool calls, and -deny or prompt before a flow violates policy. - -A public MCP server is the natural emitter. [`github-mcp-server`](https://github.com/github/github-mcp-server) -returns repository data whose confidentiality follows from repository visibility -and collaborator sets — the same public/private signal — but does **not** emit -IFC labels today. Closing that emitter gap is the concrete proof point for this -profile: a host-side consumer of the label shape already exists, so the missing -half is a server willing to emit it, classifying each resource it returns (see -[per-resource classification](#reference-implementation)). - -## Specification - -### Profile identity - -This profile is selected by `evidenceRef.type == "ifc.fides.v1"` on a -`trust-annotations` annotation. 
A client that does not implement IFC MUST be -able to ignore it safely (the surrounding `sensitive` / `untrusted` booleans and -the `digest`/`canonicalization` pair remain meaningful). - -### Label payload - -The record referenced by the `evidenceRef` (and, for low-friction adoption, MAY -be inlined by deployments that accept the wire cost) has the shape: - -```jsonc -{ - "integrity": "trusted", // "trusted" | "untrusted" (FIDES §4.1 two-level lattice) - "confidentiality": "public" // "public" | "private" -} -``` - -| Field | Meaning | -| :--- | :--- | -| `integrity` | Two-level integrity lattice (`trusted` ⊑ `untrusted`): trusted data may flow to untrusted sinks, not vice versa. | -| `confidentiality` | `"public"` = world-readable; `"private"` = an opaque marker meaning "restricted to some reader set". The concrete reader set is resolved host-side at policy-decision time (see [Reader-set resolution](#reader-set-resolution)). | - -> **Confidentiality is `public` / `private` only — never a reader list on the -> wire.** Emitting concrete reader identities (e.g. logins) is out of scope: user -> identity is not uniform across servers using different auth methods, the -> identities are themselves access-restricted data, and a single resource can -> have hundreds of readers. The opaque marker keeps the wire shape stable and the -> sensitive resolution host-side. - -### Label semantics - -The load-bearing distinction is between the wire and the host: **wire markers are -advisory hints; reader-set semantics are host-resolved.** The asymmetry between -the two joins below follows from that one cut. - -- **Join on accumulation.** As a session ingests labeled results, the context - label is the *join* of what it has seen: integrity degrades toward - `untrusted`, confidentiality narrows toward the smallest permitted reader set. 
- The two joins differ in *where* they can be computed, and the difference is - principled rather than incidental: - - **Integrity join is total and wire-computable.** The integrity lattice is - small and closed (`trusted ⊑ untrusted`), so `untrusted` dominates and the - join needs nothing beyond the wire values. - - **Confidentiality join is partial and host-resolved.** Reader sets are open - and host-knowledge-dependent. `public` is the one wire-computable case, - because its reader set is universal (`⊤`): `public ⊔ x = x` for any `x`. - `private ⊔ private`, by contrast, is the *intersection* of two reader sets - that the opaque markers don't carry, so it is **not** computable from the - wire — see [Reader-set resolution](#reader-set-resolution). -- **Policy check before egress.** Before a write/egress tool call, the host - checks whether the current context label may flow to the call's target. When - a label is absent, the host falls back to its default (trusted-action) - policy rather than assuming the worst — labels are an *additive* signal. - -> The normative integrity/confidentiality lattice definitions follow the FIDES -> paper, §4.1 and §4.3. This draft references the model rather than restating -> the proofs. - -### Reader-set resolution - -`"private"` is intentionally opaque on the wire — and that opaqueness is a -property of the security model, not a limitation of the spec. A reader set is not -transmissible without policy context, so the wire shape correctly declines to -carry it. Two distinct `"private"` markers (e.g. file contents from two different -private repositories) are **not equal**, and their confidentiality join is **not** -the same `"private"` token: data derived from both may flow only to principals who -can read *both* sources — the intersection of their reader sets. 
The opaque marker -cannot express this intersection, so a host that needs to make a precise -cross-source flow decision MUST resolve each `"private"` marker to a concrete -reader set before joining. - -Resolution is a host-side concern, performed at policy-decision time: - -1. The host maps each contributing `"private"` label back to its source (e.g. - via the `evidenceRef.ref` locator, or its own record of which tool result - carried the label). -2. The host queries the originating system for the current reader set (e.g. a - repository collaborators lookup) using its own credentials. -3. The host computes the flow decision over the resolved sets (intersection for - a join of multiple private sources) and then discards them. - -**When resolution is unavailable** — the `ref` is absent, the label is -digest-only, or the originating system is unreachable at decision time — the host -MUST NOT treat two opaque labels as equal, and MUST NOT treat `"private"` as -`"public"`. It denies, prompts, or applies its configured fail-closed policy. Two -`"private"` labels are equal only once resolution proves their sources are; until -a source is established, unknown or mixed provenance classifies as `"private"`, -never defaulted to `"public"` from a repository-level shortcut. - -The resolved reader set is a decision-time read performed under the host's own -credentials. It is not a durable grant: a host SHOULD NOT cache it as one or -serialize it back into annotations or evidence unless a deployment explicitly opts -in. This keeps the wire free of user identities while still letting the host make a -precise decision when it holds source provenance plus its own credentials. - -### Relationship to `trust-annotations` - -`ifc-fides` never appears without a host `trust-annotations` annotation -carrying the `evidenceRef`. The booleans are the universally-actionable signal; -the IFC label is the precise, host-checkable evidence behind them. 
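The label semantics and reader-set resolution steps above can be sketched host-side as follows. This is an illustrative sketch under stated assumptions, not the reference consumer: `join_integrity`, `may_flow`, and the `Resolver` callback are hypothetical names, the resolver stands in for whatever lookup the host performs with its own credentials, and the egress target is simplified to a single principal.

```python
from typing import Callable, FrozenSet, Optional, Sequence

Readers = FrozenSet[str]
# Hypothetical resolver: maps a source locator (for example an evidenceRef.ref)
# to the current reader set, or None when resolution is unavailable.
Resolver = Callable[[str], Optional[Readers]]


def join_integrity(a: str, b: str) -> str:
    """Total and wire-computable: `untrusted` dominates."""
    return "untrusted" if "untrusted" in (a, b) else "trusted"


def may_flow(private_sources: Sequence[str], target: str, resolve: Resolver) -> bool:
    """Decide whether context derived from the given private sources may flow
    to `target`. The target must be able to read every source, i.e. it must
    lie in the intersection of the resolved reader sets."""
    allowed: Optional[Readers] = None
    for ref in private_sources:
        readers = resolve(ref)
        if readers is None:
            # Resolution unavailable: never treat opaque labels as equal or
            # as public. The caller then denies or prompts.
            return False
        allowed = readers if allowed is None else allowed & readers
    # No private sources contributed, so nothing restricts the flow here.
    return allowed is None or target in allowed
```

The resolved sets exist only inside `may_flow` and are dropped when it returns, in line with the rule that a resolved reader set is a decision-time read rather than a durable grant.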
- -## Reference implementation - -- **Consumer:** a host-side IFC engine that parses the `{integrity, - confidentiality}` label, maintains a context label across tool results, and - applies a flow policy before egress operations already exists in practice. - (Linked once a public reference is available.) -- **Emitter (gap / proof point):** [`github-mcp-server`](https://github.com/github/github-mcp-server) - is the candidate — it already knows repository visibility and collaborator - sets, which are the confidentiality inputs. Repository visibility is only a - *default* hint, not the whole story: a public repository can serve - sub-resources that are **not** world-readable (draft security advisories, - draft releases, the collaborator roster itself, authenticated-user fields), so - a correct emitter MUST classify **per resource returned**, not per repository. - That makes the emitter a non-trivial proof point rather than a one-line - `repo.private` read. - -## Open questions - -- Should the label be inlinable on `_meta.ifc` directly for low-friction - adoption, or always behind `evidenceRef` for schema minimalism? (Lean: - permit both; `evidenceRef` is canonical, inline is a convenience.) -- How does GitHub Enterprise `internal` repository visibility map onto the - `public` / `private` confidentiality model? (Audience is the whole org, - strictly broader than collaborators — likely classified `private` and resolved - host-side, or falls back to default policy.) -- Registry coordination with other attestation/evidence profiles - (e.g. SEP-2787) so `evidenceRef.type` values don't collide. - -## Changelog - -| Date | Change | -| ---------- | ------------------------------------------------------------ | -| 2026-06-10 | Initial draft skeleton. Reframed from a top-level `ifc` extension to a `trust-annotations` `evidenceRef` profile (`ifc.fides.v1`). 
| -| 2026-06-15 | Confidentiality limited to `public` / `private` on the wire (dropped reader-list); added Reader-set resolution section; emitter classifies per resource, not per repository. (Review: @JoannaaKL.) | -| 2026-06-16 | Lead the semantics with the wire-hint / host-resolved split; state the integrity-total vs confidentiality-partial asymmetry as principled (`public` = `⊤` is wire-computable, `private ⊔ private` is not); add fail-closed handling when resolution is unavailable and a no-durable-grant rule for resolved sets. (Review: @Rul1an.) | From 67f84f72e8f90bbc8e1c0b410b699ef10bbd9e8d Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Tue, 16 Jun 2026 01:30:47 +0200 Subject: [PATCH 08/10] Sync intent-comment record with the live SEP-1913 comment The posted SEP-1913 umbrella comment was updated to match the shipped structure: two extensions plus a schemes/ folder, with FIDES as one data-labelling scheme (ifc.fides.v1) rather than an extension/profile. Bring docs/intent-comment.md back in sync as the source of record and add the three stacked PR links (#2 base, #3 action-metadata, #4 FIDES scheme). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/intent-comment.md | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/docs/intent-comment.md b/docs/intent-comment.md index 7a99730..f3e4946 100644 --- a/docs/intent-comment.md +++ b/docs/intent-comment.md @@ -35,25 +35,37 @@ in sync with [sep-disposition.md](./sep-disposition.md). > an Extensions Track SEP. Incubation is in > [`experimental-ext-tool-annotations`](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations). > > -> Kicking off with a few draft extensions in the tool annotations repo — not -> sure yet whether they'd each need separate repos eventually, or whether -> grouping them in one is fine. That's part of what incubation is for. 
+> It's now scaffolded as \*\*two extensions plus a `schemes/` folder\*\* of +> interchangeable data-labelling approaches, shipped as a stacked set of PRs: +> > +> - [#2](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/2) — repo scaffolding + the `trust-annotations` extension (the base). +> - [#3](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/3) — the `action-metadata` extension. +> - [#4](https://github.com/modelcontextprotocol/experimental-ext-tool-annotations/pull/4) — FIDES as a data-labelling \*\*scheme\*\* under `schemes/`. > > > | Extension | Scope | > |---|---| > | `trust-annotations` | The narrow data-classification taxonomy (`sensitive`, `untrusted`) + an open-ended `evidenceRef` pointer for richer, out-of-band evidence. | > | `action-metadata` | Tool I/O + outcome contract (folds in @rreichel3's SEP-2061). | -> | `ifc-fides` | Information-flow control ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) as **one profile** of `evidenceRef`, not a wire root — the public/private-repo confidentiality case, with github-mcp-server as an emitter example. | +> > +> \*\*Data-labelling schemes (the `evidenceRef` slot).\*\* Richer evidence models +> are \*not\* extensions and \*not\* a wire root. They live in `schemes/` as +> interchangeable fillers of the `trust-annotations` `evidenceRef` slot, each +> selected by an `evidenceRef.type` value, so a deployment can adopt one, several, +> or none without changing the extension. FIDES information-flow control +> ([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)) is the first worked scheme +> (`ifc.fides.v1`) — the public/private-repo confidentiality case, with +> github-mcp-server as an emitter example. The folder is built to hold the range of +> other models raised in review (coarse data classification, design-pattern +> controls, capability tokens, cosigning, sequence-shape, attestation). 
> > > Deliberately removed: `maliciousActivityHint` (the structural concerns raised > here are unresolved) and session-level propagation rules. > > > This follows the same Standards-Track → Extensions-Track refactor pattern as -> SEP-2127 (#2893). This PR will eventually pivot to the `trust-annotations` -> piece itself, with the other schema-bearing pieces moving out into their own -> extensions. Everything is still in the incubation phase, so naming, design, -> and the choice of what to put forward as an extension are all open for -> discussion in the IG. +> SEP-2127 (#2893). This PR is now the `trust-annotations` base of the stack; the +> `action-metadata` extension and the `schemes/` folder are stacked on it. +> Everything is still in the incubation phase, so naming, design, and the choice of +> what to put forward as an extension are all open for discussion in the IG. --- From 0be45d9b950fc6e0fcb9f7ace72dbd81f44d5ade Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Tue, 16 Jun 2026 08:56:46 +0200 Subject: [PATCH 09/10] Record SEP-1913 feedback as cited open questions; clarify untrusted vs openWorldHint Enrich the base extension docs so the early review history is preserved rather than erased by the narrow first cut: - trust-annotations.mdx: add a normative subsection distinguishing result-level untrusted from tool-definition openWorldHint (the SEP-1913 #711 distinction). - open-questions.md: capture the sensitivity-field shape debate, enforcement-vs- advisory risk, per-block byte ranges, taint persistence, and sequence-shape, each attributed to the reviewer who raised it; add a Naming (under review) section for the contested field/extension names. - related-work.md: add verified FIDES alternatives (Permissive IFC 2410.03055, AirGapAgent 2405.05175, CaMeL 2503.18813) and a schemes-vs-host-architectures distinction. - decisions.md: log the scheme/architecture boundary and the feedback-capture decision. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/decisions.md | 28 ++++++++++ docs/open-questions.md | 63 ++++++++++++++++++++--- docs/related-work.md | 14 ++++- specification/draft/trust-annotations.mdx | 20 +++++++ 4 files changed, 117 insertions(+), 8 deletions(-) diff --git a/docs/decisions.md b/docs/decisions.md index a451955..e3a574b 100644 --- a/docs/decisions.md +++ b/docs/decisions.md @@ -117,3 +117,31 @@ that extension's `evidenceRef` slot. **Rationale.** Separate PRs let each piece be reviewed and graduate on its own clock. FIDES stacks on `trust-annotations` because a scheme has no meaning without the slot it fills. + +## 2026-06-17 — Schemes carry data labels; host architectures do not + +**Decision.** `schemes/` holds **data-labelling** approaches a server attaches to +a result (FIDES, Permissive IFC, AirGapAgent, `data-class`, attestation +envelopes). **Host architectures** — control-flow designs the client/host runs +(CaMeL, the "Design Patterns for Securing LLM Agents" catalogue, Dual-LLM) — are +prior art in [`related-work.md`](./related-work.md), not candidate schemes. + +**Rationale.** A scheme produces a label; an architecture decides what to do with +one. Conflating them would invite a `schemes/camel.md` that has no per-result +payload to define. A capability token such an architecture issues can still be +*referenced* through `evidenceRef`, but the architecture itself is not a scheme. + +## 2026-06-17 — Early SEP-1913 feedback recorded as cited open questions + +**Decision.** The substantive concerns from the original issue (#711) and SEP +(#1913) review — the set-theoretic critique of linear sensitivity, org-defined +vocabularies, the class+regulatory pairing, taint persistence across storage, +per-block byte ranges, sequence-shape, and the false-security risk — are +captured with reviewer attributions in [`open-questions.md`](./open-questions.md) +rather than silently dropped by the narrower cut. 
+ +**Rationale.** The narrow first cut (`sensitive: boolean`) deliberately omits a +lot of debated design. Recording *why*, with links to the people who raised each +point, keeps the history visible and gives each parked item a home to graduate +from (a scheme, an `action-metadata` field, or a future extension) instead of +being re-litigated from scratch. diff --git a/docs/open-questions.md b/docs/open-questions.md index 7adb9d9..8432a37 100644 --- a/docs/open-questions.md +++ b/docs/open-questions.md @@ -16,13 +16,50 @@ Tracked here rather than in the spec drafts, so the drafts stay non-temporal. ## trust-annotations -- Is `sensitive` the right single coarse signal, or do we need the - `data-class.v1` profile from day one? +- **Shape of the sensitivity signal.** Is `sensitive: boolean` the right single + coarse signal, or does the narrow cut overcorrect? The SEP-1913 thread rejected + a linear `sensitiveHint: low|medium|high` because sensitivity is set-theoretic, + not a scale ([@JustinCappos](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2967516811): + medical results to a mail MCP *and* a card number to a payment MCP, but neither + crossed). Reviewers then pushed for org-defined vocabularies + ([@olaservo](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2968743154), + [@Mossaka](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2971788308)) + and a class+regulatory pairing such as `confidential:hipaa` + ([@krubenok](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#discussion_r3103485194)). + Against that, [@localden](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4037623595) + warned a baked-in taxonomy is hard to remove. 
Open: keep the boolean and let + the [`data-class.v1` scheme](../schemes/data-class.md) carry the taxonomy, or + also expose a structured escape hatch on the wire so regulated flows are + expressible without a scheme? +- **Enforcement vs. advisory.** A self-declared `sensitive: true` from a + poorly-configured or malicious server could create a false sense of security + ([@localden](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4037623595)). + [docs/trust-model.md](./trust-model.md) puts verification in registries; is + that enough without a normative client-side check path? - Content-block-level vs. result-level attachment — does the draft need a - worked multi-result example before it's implementable? + worked multi-result example before it's implementable? Per-block annotation + with **byte/codepoint ranges** was requested so clients can highlight the + flagged span, not just the whole result + ([@connor4312](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-3849207989)). - `list_changed`: confirmed response-level annotations don't participate; revisit only if trust vocabulary ever attaches to tool definitions. +## Naming (under review) + +These are explicitly unsettled and are being reviewed against the SEP-1913 +record before any rename lands: + +- **Umbrella name.** The original SEP was "Trust *& Sensitivity* Annotations" — + two axes (integrity *and* confidentiality). `trust-annotations` reads as the + integrity half; does the name hide the `sensitive` (confidentiality) half? +- **`untrusted` vs `openWorldHint`.** Same concept, different layer (result vs. + tool definition). Share a name/vocabulary, or keep them deliberately distinct? +- **`evidenceRef.type` vs `evidenceRef.scheme`.** The repo calls these values + "schemes" (`schemes/`); the selector field is `type`. Align the field name? 
+- **`evidenceRef` itself.** Its ancestor is [@pshkv](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4196867926)'s + `decision_ref` / `attestation_ref` / `policy_profile`. Is `evidenceRef` the + clearest umbrella, or should the pointer name the kind of thing it references? + ## action-metadata - Coexistence vs. replacement of legacy `destructiveHint` / `readOnlyHint` / @@ -42,8 +79,20 @@ Tracked here rather than in the spec drafts, so the drafts stay non-temporal. ## Parked (SEP-1913 umbrella) -- **`maliciousActivityHint`** — if it returns, it is per-`ContentBlock` with - spans, driven by the host's own detection, not a server-attested boolean. +- **`maliciousActivityHint`** — killed in review on three grounds + ([@connor4312](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-3849207989)): + it can't be resolved before execution for dynamic tools; a boolean gives no + UX or ranges; and clients won't trust a server's self-report of its own + maliciousness. If it returns, it is per-`ContentBlock` with spans, driven by + the **host's** own detection, not a server-attested boolean. +- **Taint persistence across the store boundary** — a label must survive + round-tripping through storage: write a card number to a file, read it back, + and the sensitivity label must not silently disappear + ([@JustinCappos](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2967516811)). + This is a propagation property no single response annotation can guarantee on + its own; it needs the taxonomy and `evidenceRef` stable first. - **Session-level propagation rules** — escalation semantics and the - sequence-shape gap ("this was call N in a flagged sequence" has no response - annotation surface today). 
+ **sequence-shape** gap: "this was call N in a flagged sequence" has no response + annotation surface today ([marras0914](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913)). + A candidate `evidenceRef` scheme could carry a sequence assertion; tracked in + [`schemes/README.md`](../schemes/README.md). diff --git a/docs/related-work.md b/docs/related-work.md index 5cdb88f..858391f 100644 --- a/docs/related-work.md +++ b/docs/related-work.md @@ -15,10 +15,22 @@ annotation work. Several were surfaced in IG meetings (notably 2026-05-28). ## Research - **FIDES** — *Information-flow control for LLM agents.* [arXiv:2505.23643](https://arxiv.org/abs/2505.23643). Basis for the `ifc.fides.v1` scheme in [`schemes/`](../schemes/). -- **Design Patterns for Securing LLM Agents** — IBM/Google/Microsoft. [arXiv:2506.08837](https://arxiv.org/abs/2506.08837). Plan-Then-Execute, Dual LLM, Map-Reduce, etc. +- **Permissive Information-Flow Analysis for LLMs** — relaxes IFC join so a label propagates only when an input actually influences an output. [arXiv:2410.03055](https://arxiv.org/abs/2410.03055). Candidate `evidenceRef` scheme (per-result label), like FIDES. +- **AirGapAgent** — contextual-integrity minimisation: restrict per-task data to what the context warrants. [arXiv:2405.05175](https://arxiv.org/abs/2405.05175). Candidate scheme: emits a contextual-integrity classification per result. +- **CaMeL — Defeating Prompt Injections by Design** — capability-based control/data-flow extraction. [arXiv:2503.18813](https://arxiv.org/abs/2503.18813). A **host architecture**, not a data-label scheme (see note below); a capability token it issues could be referenced via `evidenceRef`, but the architecture itself is not a scheme. +- **Design Patterns for Securing LLM Agents** — IBM/Google/Microsoft. [arXiv:2506.08837](https://arxiv.org/abs/2506.08837). Plan-Then-Execute, Dual LLM, Map-Reduce, etc. Also **host architectures**, not schemes. 
- **Trail of Bits** — prompt-injection via hidden content in GitHub issues. [blog](https://blog.trailofbits.com/2025/08/06/prompt-injection-engineering-for-attackers-exploiting-github-copilot/). - **OpenAI Auto Review** — https://alignment.openai.com/auto-review/ (shared in IG chat). +### Schemes vs. host architectures + +The `evidenceRef` slot carries **data labels** — a per-result record a server +can attach (FIDES, Permissive IFC, AirGapAgent, data-class, attestation +envelopes). It does **not** carry **host architectures** — control-flow designs +the *client/host* runs (CaMeL, the Design-Patterns catalogue, Dual-LLM). These +are complementary: an architecture decides what to do with a label, the label is +what a scheme produces. Only the former belong in [`schemes/`](../schemes/). + ## Implementations & tooling - [`kapil8811/mcp-trust-annotations`](https://github.com/kapil8811/mcp-trust-annotations) — reference Python SDK PoC for `trust-annotations`. diff --git a/specification/draft/trust-annotations.mdx b/specification/draft/trust-annotations.mdx index 395e3c8..4ed0689 100644 --- a/specification/draft/trust-annotations.mdx +++ b/specification/draft/trust-annotations.mdx @@ -137,6 +137,26 @@ are present, the content-block annotation refines the result-level one for that block; it MUST NOT *weaken* a result-level claim (union semantics — once `true`, stays `true`). +### Relationship to existing `*Hint` annotations + +MCP tool definitions already carry an `openWorldHint` (alongside `readOnlyHint` / +`destructiveHint` / `idempotentHint`). `untrusted` is deliberately **not** a +synonym for `openWorldHint`: + +- `openWorldHint` is a property of the **tool definition** — "this tool reaches + an open, attacker-influenceable world" (e.g. a web fetch). It is known at + registration time and does not vary per call. +- `untrusted` is a property of a **specific result** — "*this* returned data + originated from an open-world / untrusted source." 
A tool whose `openWorldHint` + is `true` may still return trusted data on a given call, and a tool whose + `openWorldHint` is `false` can surface untrusted data it read from storage. + +The original SEP-1913 proposal reused `openWorldHint` for the result-level +"untrusted source" signal and then drew exactly this distinction +([issue #711](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711)). +Whether the two should share a name or vocabulary is an +[open question](../../docs/open-questions.md). + ### Lifecycle and `list_changed` Trust annotations defined here are **response-level**: they describe a specific From d0733ef8e6b6a1117f3cfa594027c89834c78d7f Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Tue, 16 Jun 2026 10:06:40 +0200 Subject: [PATCH 10/10] Frame 'sensitive' as the lowest-common-denominator floor; recommend emitting both MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per Sam: the coarse 'sensitive' boolean exists to give a universal, 'better than nothing' policy-enforcement floor every client can act on — a basic/general scheme — not a competitor to richer schemes. It is not either/or with the taxonomy; servers are encouraged to do both. - trust-annotations.mdx: reframe the coarse-vs-rich section around the LCD floor; add a SHOULD-emit-both recommendation (boolean + richer evidenceRef scheme); point Motivation at it; reframe the in-spec open question accordingly; changelog row. - decisions.md: log the LCD-floor + emit-both decision; correct two entries mis-dated 2026-06-17 to today (2026-06-16). - open-questions.md: reframe the sensitivity item from 'is the boolean right' to the narrower residual (is boolean + data-class scheme enough, or is a wire escape hatch needed); current lean: no escape hatch. - terminology: profile -> scheme in forward-looking text (open-questions registry line, sep-disposition SEP-2787 row); historical log entries left as-is. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/decisions.md | 23 +++++++++++++++-- docs/open-questions.md | 31 +++++++++++++---------- docs/sep-disposition.md | 2 +- specification/draft/trust-annotations.mdx | 31 ++++++++++++++++++----- 4 files changed, 63 insertions(+), 24 deletions(-) diff --git a/docs/decisions.md b/docs/decisions.md index e3a574b..e3cdd5a 100644 --- a/docs/decisions.md +++ b/docs/decisions.md @@ -118,7 +118,7 @@ that extension's `evidenceRef` slot. clock. FIDES stacks on `trust-annotations` because a scheme has no meaning without the slot it fills. -## 2026-06-17 — Schemes carry data labels; host architectures do not +## 2026-06-16 — Schemes carry data labels; host architectures do not **Decision.** `schemes/` holds **data-labelling** approaches a server attaches to a result (FIDES, Permissive IFC, AirGapAgent, `data-class`, attestation @@ -131,7 +131,7 @@ one. Conflating them would invite a `schemes/camel.md` that has no per-result payload to define. A capability token such an architecture issues can still be *referenced* through `evidenceRef`, but the architecture itself is not a scheme. -## 2026-06-17 — Early SEP-1913 feedback recorded as cited open questions +## 2026-06-16 — Early SEP-1913 feedback recorded as cited open questions **Decision.** The substantive concerns from the original issue (#711) and SEP (#1913) review — the set-theoretic critique of linear sensitivity, org-defined @@ -145,3 +145,22 @@ lot of debated design. Recording *why*, with links to the people who raised each point, keeps the history visible and gives each parked item a home to graduate from (a scheme, an `action-metadata` field, or a future extension) instead of being re-litigated from scratch. 
+ +## 2026-06-16 — `sensitive` is a lowest-common-denominator floor; emit both + +**Decision.** The coarse `sensitive` boolean is intentionally a +lowest-common-denominator signal — a universal, always-actionable floor that +supports a basic "better than nothing" egress/consent policy even against a +barely-known server. It is the basic, general scheme every participant +understands, **not** a competitor to richer schemes. Servers SHOULD emit **both** +the boolean and a richer `evidenceRef` scheme (e.g. `data-class.v1`, +`ifc.fides.v1`) where they can; `sensitive` MUST NOT be dropped merely because a +scheme is present. + +**Rationale.** Clarifies the original purpose of the boolean (raised by Sam): the +point of keeping it on the wire was never to *replace* richer classification but +to guarantee a floor any client can act on. Richer schemes are strictly more +capable but are not universally implemented, so they cannot be the floor — +layering the two gives universal actionability without capping what advanced +hosts can do. This also answers the "boolean vs. richer taxonomy" tension from +SEP-1913 review: it is not either/or, it is both, at different layers. diff --git a/docs/open-questions.md b/docs/open-questions.md index 8432a37..86898e7 100644 --- a/docs/open-questions.md +++ b/docs/open-questions.md @@ -10,27 +10,30 @@ Tracked here rather than in the spec drafts, so the drafts stay non-temporal. - **Cross-domain integrity verification** — is asymmetric crypto for domain identity in scope for a future extension, or out of scope entirely? CLI tools remain a persistent gap for enforcing these constraints. -- **`evidenceRef.type` registry** — who curates the list of well-known profile +- **`evidenceRef.type` registry** — who curates the list of well-known scheme types, and how do we coordinate with attestation SEPs (e.g. SEP-2787) so values don't collide? 
## trust-annotations -- **Shape of the sensitivity signal.** Is `sensitive: boolean` the right single - coarse signal, or does the narrow cut overcorrect? The SEP-1913 thread rejected - a linear `sensitiveHint: low|medium|high` because sensitivity is set-theoretic, - not a scale ([@JustinCappos](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2967516811): - medical results to a mail MCP *and* a card number to a payment MCP, but neither - crossed). Reviewers then pushed for org-defined vocabularies +- **Sensitivity beyond the floor.** The `sensitive` boolean is settled as the + **lowest-common-denominator floor** — a basic, general signal every client can + act on, "better than nothing," with servers encouraged to *also* emit a richer + scheme (see the [decision log](./decisions.md) and the "emit both" guidance in + the spec). The residual open question is narrower: is "coarse boolean + richer + `evidenceRef` scheme" sufficient for regulated flows, or do some hosts need the + classification expressible *on the wire* without a scheme? Current lean: **no + wire escape hatch** — keep the wire floor un-rottable and push the taxonomy + into [`data-class.v1`](../schemes/data-class.md). Background on why a single + scalar/enum was rejected: sensitivity is set-theoretic not linear + ([@JustinCappos](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2967516811)), + reviewers wanted org-defined vocabularies ([@olaservo](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2968743154), [@Mossaka](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2971788308)) - and a class+regulatory pairing such as `confidential:hipaa` - ([@krubenok](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#discussion_r3103485194)). 
- Against that, [@localden](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4037623595) - warned a baked-in taxonomy is hard to remove. Open: keep the boolean and let - the [`data-class.v1` scheme](../schemes/data-class.md) carry the taxonomy, or - also expose a structured escape hatch on the wire so regulated flows are - expressible without a scheme? + and a class+regulatory pairing + ([@krubenok](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#discussion_r3103485194)), + and a baked-in taxonomy is hard to remove + ([@localden](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4037623595)). - **Enforcement vs. advisory.** A self-declared `sensitive: true` from a poorly-configured or malicious server could create a false sense of security ([@localden](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4037623595)). diff --git a/docs/sep-disposition.md b/docs/sep-disposition.md index 7fe65d0..7a1fe83 100644 --- a/docs/sep-disposition.md +++ b/docs/sep-disposition.md @@ -105,7 +105,7 @@ with it if it lands, but do not block on it. | 1862 | Tool Resolution (pre-flight) | Stays core / Standards Track | — (composes, no dependency) | | 1984 | Comprehensive Tool Annotations | IG discussion item | — | | 2417 | Model Preferences for Tools | IG discussion item | — | -| 2787 | Tool Call Attestation | Candidate `evidenceRef` profile | (future) | +| 2787 | Tool Call Attestation | Candidate `evidenceRef` scheme | (future) | ## Intent comment diff --git a/specification/draft/trust-annotations.mdx b/specification/draft/trust-annotations.mdx index 4ed0689..6fe939e 100644 --- a/specification/draft/trust-annotations.mdx +++ b/specification/draft/trust-annotations.mdx @@ -40,7 +40,10 @@ broadly-applicable signals cover the majority of client-actionable cases: Drives prompt-injection defenses. 
Anything richer than these two booleans is deliberately **not** on the wire; it -hangs off `evidenceRef` (see below). +hangs off `evidenceRef` (see below). These two signals are a +**lowest-common-denominator floor**, not a ceiling: servers are encouraged to +*also* attach a richer scheme via `evidenceRef` — see +[Coarse vs. rich](#coarse-vs-rich-classification-dataclass). ## Specification @@ -113,7 +116,7 @@ SEP-1913 carried a four-level data classification (`public` / `personal` / `confidential` / `highly_confidential`) plus a regulatory scope (e.g. `confidential:hipaa`). **This extension deliberately keeps only the coarse `sensitive` boolean on the wire.** Richer classification -is recovered as an `evidenceRef` profile: +is recovered as an `evidenceRef` scheme: ```jsonc "evidenceRef": { @@ -124,9 +127,19 @@ is recovered as an `evidenceRef` profile: } ``` -This is an explicit scope decision: the binary lives on the wire for universal -client actionability; the taxonomy lives behind a profile so it can evolve -without a breaking schema change. +This is an explicit scope decision. The boolean is the +**lowest-common-denominator signal**: a universal, always-actionable floor that lets any client +apply a basic egress/consent policy that is *better than nothing*, even against a +server it knows little about. It can be thought of as the basic, general scheme +that every participant understands. Richer schemes are strictly more capable but +are not universally implemented, so they cannot be the floor. + +Servers SHOULD therefore emit **both** when they can: the coarse `sensitive` +boolean for universal actionability, **and** a richer `evidenceRef` scheme (e.g. +`data-class.v1`, `ifc.fides.v1`) for hosts that implement it. The two are +layered, not alternatives — a client that understands the scheme uses it; one +that does not still has the boolean. `sensitive` MUST NOT be omitted merely +because a richer scheme is present.
### Attachment point @@ -189,8 +202,11 @@ and `evidenceRef` is how that claim is made checkable. ## Open questions -- Is `sensitive` the right single coarse signal, or do we need `sensitive` + - the `data-class.v1` profile from day one? +- `sensitive` is settled as the lowest-common-denominator floor; servers are + encouraged to emit it **and** a richer scheme. Residual: is the + [`data-class.v1` scheme](../../schemes/data-class.md) enough, or do some + regulated flows need the classification *on the wire*? See + [docs/open-questions.md](../../docs/open-questions.md). - Exact required-vs-recommended split on `evidenceRef` fields. - Whether content-block-level annotation needs a worked multi-result example before the draft is implementable. @@ -201,3 +217,4 @@ and `evidenceRef` is how that claim is made checkable. | ---------- | ------------------------------------------------------------- | | 2026-06-10 | Initial draft skeleton. Narrowed to `sensitive` + `untrusted` + `evidenceRef`; DataClass demoted to a profile; `requires_review` moved to `action-metadata`. | | 2026-06-16 | Show `jcs/rfc8785` alongside `cbor/rfc8949` so canonicalization reads as a per-reference envelope choice, not a default; note the `type`/`digest`/`canonicalization` triple as the local-re-derivation minimum. (Review: @Rul1an.) | +| 2026-06-16 | Add `untrusted` vs `openWorldHint` relationship subsection; frame `sensitive` as the lowest-common-denominator floor and recommend servers emit both the boolean and a richer `evidenceRef` scheme. |
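
Reviewer sketch, not part of the patch: the layering this series recommends (use the scheme if the client understands it, otherwise fall back to the `sensitive` floor), together with the union rule for content-block annotations, could look roughly like the TypeScript below. Only `sensitive`, `untrusted`, and `evidenceRef.type` are taken from the draft. The `Decision` values, the function names, treating an absent flag as `false`, and using `data-class.v1` as a `type` value are assumptions made for illustration.

```typescript
// Hypothetical shapes: only `sensitive`, `untrusted`, and `evidenceRef.type`
// come from the draft; every other name here is illustrative.
interface EvidenceRef {
  type: string; // assumed to hold a scheme identifier such as "data-class.v1"
}

interface TrustAnnotations {
  sensitive?: boolean;
  untrusted?: boolean;
  evidenceRef?: EvidenceRef;
}

type Decision = "allow" | "ask-consent" | "scheme-policy";

// Union semantics: a content-block annotation may refine a result-level claim
// but must not weaken it, so a flag that is true at either level stays true.
// Assumption: an absent flag is treated as false.
function effectiveFlags(
  result: TrustAnnotations,
  block: TrustAnnotations = {},
): { sensitive: boolean; untrusted: boolean } {
  return {
    sensitive: result.sensitive === true || block.sensitive === true,
    untrusted: result.untrusted === true || block.untrusted === true,
  };
}

// Layered handling: prefer a scheme the client implements; otherwise act on
// the coarse boolean floor, which is always available.
function decideEgress(
  annotations: TrustAnnotations,
  understoodSchemes: Set<string>,
): Decision {
  const ref = annotations.evidenceRef;
  if (ref !== undefined && understoodSchemes.has(ref.type)) {
    return "scheme-policy"; // hand off to scheme-specific logic (not shown)
  }
  return annotations.sensitive === true ? "ask-consent" : "allow";
}
```

A client that does not implement `data-class.v1` still reaches a decision from the boolean alone, which is the reason the draft asks servers not to drop `sensitive` when a scheme is present.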