From 41211a8107964a630d4c688fdb61c3c39e7306ba Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Tue, 16 Jun 2026 01:26:49 +0200 Subject: [PATCH 1/3] Add FIDES as a data-labelling scheme under schemes/ Move the FIDES information-flow-control draft out of specification/draft/ and into schemes/ifc-fides.md, reframed from a sibling extension into one data-labelling scheme that fills the trust-annotations evidenceRef slot (type ifc.fides.v1). FIDES is not an extension and not a peer of the extensions; it is one interchangeable filler of the slot. Add schemes/README.md as the index for the folder: it states what a scheme is, lists FIDES as the worked example, records the range of candidate schemes raised in SEP-1913 review (data classification, design-pattern controls, ShardGuard, capability tokens, cosigning, sequence-shape, attestation), and sets the bar a scheme doc must meet. The reviewed normative body (label payload, semantics, reader-set resolution, reference implementation, open questions) is preserved verbatim from the JoannaaKL / Rul1an review rounds. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- schemes/README.md | 55 ++++++++++++ schemes/ifc-fides.md | 198 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 253 insertions(+) create mode 100644 schemes/README.md create mode 100644 schemes/ifc-fides.md diff --git a/schemes/README.md b/schemes/README.md new file mode 100644 index 0000000..a7d63f1 --- /dev/null +++ b/schemes/README.md @@ -0,0 +1,55 @@ +# Data-labelling schemes + +A **scheme** is a concrete data-labelling or tool-annotation approach that fills +the [`trust-annotations`](../specification/draft/trust-annotations.mdx) +`evidenceRef` slot under an `evidenceRef.type` value. The extension defines a +small, stable wire vocabulary and an open `type` pointer; a scheme defines the +richer, out-of-band record that pointer resolves to. + +Schemes are **not** extensions and **not** siblings of the extensions. 
They are +interchangeable: a deployment can adopt one, several, or none, and can swap them +without changing the extension. Modelling each labelling approach as a scheme keeps +any single academic model out of the wire root, which is the reason FIDES lives +here rather than as a top-level extension. + +## Schemes here + +| Scheme | `evidenceRef.type` | Status | Source | +| :--- | :--- | :--- | :--- | +| [FIDES information-flow control](./ifc-fides.md) | `ifc.fides.v1` | Draft skeleton | [arXiv:2505.23643](https://arxiv.org/abs/2505.23643) | + +## Candidate schemes (not yet drafted) + +The open `type` slot is designed to carry the range of models raised in the +[SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) +review and the surrounding literature. Each is a candidate for its own scheme doc: + +| Approach | Likely `type` | Source | +| :--- | :--- | :--- | +| Coarse data classification (level + regulatory scope) | `data-class.v1` | SEP-1913 taxonomy (e.g. `confidential:hipaa` shape) | +| Design-pattern controls (Plan-Then-Execute, Dual LLM, Map-Reduce) | — | [arXiv:2506.08837](https://arxiv.org/abs/2506.08837) | +| ShardGuard | — | cited in SEP-1913 | +| Capability-token constraints (SINT) | — | SEP-1913 review thread | +| Caller/tool cosigning | — | SEP-1913 review thread | +| Sequence-shape audit records | — | SEP-1913 review thread | +| Tool-call attestation (in-toto / OVERT envelopes) | — | [SEP-2787](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2787), in-toto, OVERT | + +These are leads, not commitments. A candidate becomes a scheme when someone +drafts it to the bar below; until then the slot simply stays open for it. + +## Bar for adding a scheme + +A scheme doc should state: + +1. **Identity** — the `evidenceRef.type` value it claims, and that it is selected + by `evidenceRef.type == ""` on a `trust-annotations` annotation. +2. **Payload** — the shape of the record the `evidenceRef` resolves to. +3. 
**Graceful degradation** — how a client that does not implement the scheme + ignores it safely (the `trust-annotations` booleans and + `digest`/`canonicalization` pair remain meaningful regardless). +4. **Producer/consumer** — at least a candidate emitter and consumer, so the + scheme is validated against real implementations rather than asserted. + +`type` values are coordinated through the non-binding `evidenceRef.type` registry +noted in [`trust-annotations`](../specification/draft/trust-annotations.mdx) so +they don't collide. diff --git a/schemes/ifc-fides.md b/schemes/ifc-fides.md new file mode 100644 index 0000000..26bd1ed --- /dev/null +++ b/schemes/ifc-fides.md @@ -0,0 +1,198 @@ +--- +title: FIDES information-flow control (data-labelling scheme) +--- + +> ⚠️ **Experimental scheme skeleton.** This is one **data-labelling scheme** that +> fills the [`trust-annotations`](../specification/draft/trust-annotations.mdx) +> `evidenceRef` slot via `type: "ifc.fides.v1"`. It is **not** an MCP extension +> and **not** a sibling of the extensions — it is one interchangeable way to +> populate the evidence a `trust-annotations` annotation carries. See +> [Why a scheme](#why-a-scheme-not-an-extension). + +**Scheme** — fills `trust-annotations`'s `evidenceRef` slot, selected by +`evidenceRef.type == "ifc.fides.v1"`. Not an extension. + +## Abstract + +This scheme defines `ifc.fides.v1`, an entry for the `trust-annotations` +`evidenceRef` slot that carries an **information-flow-control label** — integrity +plus confidentiality — following the FIDES model +([arXiv:2505.23643](https://arxiv.org/abs/2505.23643)). A host that implements +deterministic information-flow control can consume these labels to decide whether +a tool call is permitted, without baking the IFC model into the core protocol or +into the `trust-annotations` wire surface. 
+ +## Why a scheme, not an extension + +Information-flow control is one enforcement model among several that reviewers of +SEP-1913 raised — capability tokens, caller/tool cosigning, and sequence-shape +audit records were all put forward, and the literature adds more (ShardGuard, the +"Design Patterns for Securing LLM Agents" controls). A top-level extension +(`io.modelcontextprotocol/ifc`) would make the FIDES integrity × confidentiality +lattice the namespace root and silently foreclose those other models. + +As a `type` value behind the open-ended `evidenceRef` slot, the FIDES label is +first-class while the slot stays free for every other scheme. One reviewer's +framing captured it: IFC "fits relatively well *if you use annotations*" — an +endorsement of IFC as a scheme behind the slot, not as the wire root. The +[`schemes/`](./README.md) folder is where these interchangeable approaches live. + +## Motivation + +The motivating case is the one raised in the +[2026-05-28 IG meeting](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2820): +**a model often cannot tell whether a repository is public or private**, and +lacking that signal it may push private content to a public destination. An IFC +label lets the host track confidentiality (who may read this data) and +integrity (is this data trusted) as context accumulates across tool calls, and +deny or prompt before a flow violates policy. + +A public MCP server is the natural emitter. [`github-mcp-server`](https://github.com/github/github-mcp-server) +returns repository data whose confidentiality follows from repository visibility +and collaborator sets — the same public/private signal — but does **not** emit +IFC labels today. 
Closing that emitter gap is the concrete proof point for this +scheme: a host-side consumer of the label shape already exists, so the missing +half is a server willing to emit it, classifying each resource it returns (see +[per-resource classification](#reference-implementation)). + +## Specification + +### Scheme identity + +This scheme is selected by `evidenceRef.type == "ifc.fides.v1"` on a +`trust-annotations` annotation. A client that does not implement IFC MUST be +able to ignore it safely (the surrounding `sensitive` / `untrusted` booleans and +the `digest`/`canonicalization` pair remain meaningful). + +### Label payload + +The record referenced by the `evidenceRef` (and, for low-friction adoption, MAY +be inlined by deployments that accept the wire cost) has the shape: + +```jsonc +{ + "integrity": "trusted", // "trusted" | "untrusted" (FIDES §4.1 two-level lattice) + "confidentiality": "public" // "public" | "private" +} +``` + +| Field | Meaning | +| :--- | :--- | +| `integrity` | Two-level integrity lattice (`trusted` ⊑ `untrusted`): trusted data may flow to untrusted sinks, not vice versa. | +| `confidentiality` | `"public"` = world-readable; `"private"` = an opaque marker meaning "restricted to some reader set". The concrete reader set is resolved host-side at policy-decision time (see [Reader-set resolution](#reader-set-resolution)). | + +> **Confidentiality is `public` / `private` only — never a reader list on the +> wire.** Emitting concrete reader identities (e.g. logins) is out of scope: user +> identity is not uniform across servers using different auth methods, the +> identities are themselves access-restricted data, and a single resource can +> have hundreds of readers. The opaque marker keeps the wire shape stable and the +> sensitive resolution host-side. 
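As an illustration of the label payload above, here is a minimal Python sketch of how a consumer might validate a decoded `ifc.fides.v1` record. The names `FidesLabel` and `parse_fides_label` are hypothetical and not part of the scheme; only the two fields and their allowed values come from the text. Anything outside that vocabulary, including a reader list in place of the opaque marker, is rejected rather than guessed at.

```python
from dataclasses import dataclass

# Allowed wire values, taken from the payload table.
INTEGRITY_VALUES = ("trusted", "untrusted")
CONFIDENTIALITY_VALUES = ("public", "private")


@dataclass(frozen=True)
class FidesLabel:
    """Hypothetical in-memory form of an ifc.fides.v1 record."""
    integrity: str
    confidentiality: str


def parse_fides_label(record: dict) -> FidesLabel:
    """Validate a decoded JSON record against the two-field shape.

    Raises ValueError on a missing field or an unknown value, so the caller
    can fall back to its default policy instead of trusting a bad label.
    """
    integrity = record.get("integrity")
    confidentiality = record.get("confidentiality")
    if integrity not in INTEGRITY_VALUES:
        raise ValueError(f"unknown integrity value: {integrity!r}")
    if confidentiality not in CONFIDENTIALITY_VALUES:
        raise ValueError(f"unknown confidentiality value: {confidentiality!r}")
    return FidesLabel(integrity, confidentiality)
```

Raising on unknown values is one possible choice; a host could equally log and ignore the record, since labels are described as an additive signal.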
+ +### Label semantics + +The load-bearing distinction is between the wire and the host: **wire markers are +advisory hints; reader-set semantics are host-resolved.** The asymmetry between +the two joins below follows from that one cut. + +- **Join on accumulation.** As a session ingests labeled results, the context + label is the *join* of what it has seen: integrity degrades toward + `untrusted`, confidentiality narrows toward the smallest permitted reader set. + The two joins differ in *where* they can be computed, and the difference is + principled rather than incidental: + - **Integrity join is total and wire-computable.** The integrity lattice is + small and closed (`trusted ⊑ untrusted`), so `untrusted` dominates and the + join needs nothing beyond the wire values. + - **Confidentiality join is partial and host-resolved.** Reader sets are open + and host-knowledge-dependent. `public` is the one wire-computable case: + its reader set is universal (`⊤`), so it is the identity of the join and + `public ⊔ x = x` for any label `x`. + `private ⊔ private`, by contrast, is the *intersection* of two reader sets + that the opaque markers don't carry, so it is **not** computable from the + wire (see [Reader-set resolution](#reader-set-resolution)). +- **Policy check before egress.** Before a write/egress tool call, the host + checks whether the current context label may flow to the call's target. When + a label is absent, the host falls back to its default (trusted-action) + policy rather than assuming the worst: labels are an *additive* signal. + +> The normative integrity/confidentiality lattice definitions follow the FIDES +> paper, §4.1 and §4.3. This scheme references the model rather than restating +> the proofs. + +### Reader-set resolution + +`"private"` is intentionally opaque on the wire, and that opacity is a +property of the security model, not a limitation of the scheme. A reader set is not +transmissible without policy context, so the wire shape correctly declines to +carry it. 
Two distinct `"private"` markers (e.g. file contents from two different +private repositories) are **not equal**, and their confidentiality join is **not** +the same `"private"` token: data derived from both may flow only to principals who +can read *both* sources — the intersection of their reader sets. The opaque marker +cannot express this intersection, so a host that needs to make a precise +cross-source flow decision MUST resolve each `"private"` marker to a concrete +reader set before joining. + +Resolution is a host-side concern, performed at policy-decision time: + +1. The host maps each contributing `"private"` label back to its source (e.g. + via the `evidenceRef.ref` locator, or its own record of which tool result + carried the label). +2. The host queries the originating system for the current reader set (e.g. a + repository collaborators lookup) using its own credentials. +3. The host computes the flow decision over the resolved sets (intersection for + a join of multiple private sources) and then discards them. + +**When resolution is unavailable** — the `ref` is absent, the label is +digest-only, or the originating system is unreachable at decision time — the host +MUST NOT treat two opaque labels as equal, and MUST NOT treat `"private"` as +`"public"`. It denies, prompts, or applies its configured fail-closed policy. Two +`"private"` labels are equal only once resolution proves their sources are; until +a source is established, unknown or mixed provenance classifies as `"private"`, +never defaulted to `"public"` from a repository-level shortcut. + +The resolved reader set is a decision-time read performed under the host's own +credentials. It is not a durable grant: a host SHOULD NOT cache it as one or +serialize it back into annotations or evidence unless a deployment explicitly opts +in. This keeps the wire free of user identities while still letting the host make a +precise decision when it holds source provenance plus its own credentials. 
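The join and resolution rules above can be sketched in Python as follows. This is a sketch under stated assumptions, not the reference consumer: `join_integrity`, `may_flow_to`, and the `Resolver` callback are hypothetical names, the resolver stands in for whatever lookup the host performs under its own credentials, and outright denial is used where the text also allows prompting or another configured fail-closed policy.

```python
from typing import Callable, FrozenSet, Iterable, Optional

# Hypothetical resolver: maps a source locator to its current reader set, or
# returns None when resolution is unavailable (no ref, digest-only label,
# originating system unreachable).
Resolver = Callable[[str], Optional[FrozenSet[str]]]


def join_integrity(a: str, b: str) -> str:
    """Total, wire-computable join: `untrusted` dominates."""
    return "untrusted" if "untrusted" in (a, b) else "trusted"


def may_flow_to(
    private_sources: Iterable[str],
    recipient: str,
    resolve: Resolver,
) -> bool:
    """Decide whether data derived from the given private sources may reach
    `recipient`. Public inputs need no entry here because a universal reader
    set does not narrow the intersection.

    Fails closed: if any source cannot be resolved, the flow is denied.
    """
    allowed: Optional[FrozenSet[str]] = None  # None = no private input seen
    for source in private_sources:
        readers = resolve(source)
        if readers is None:
            return False  # unresolved "private" is never treated as public
        # Join of several private sources is the intersection of reader sets.
        allowed = readers if allowed is None else allowed & readers
    # With no private inputs the context is public and any recipient may read.
    return True if allowed is None else recipient in allowed
```

The resolved sets live only in local variables and are dropped when the function returns, which mirrors the rule that a resolved reader set is a decision-time read and not a durable grant.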
+ +### Relationship to `trust-annotations` + +The `ifc.fides.v1` label never appears without a host `trust-annotations` +annotation carrying the `evidenceRef`. The booleans are the universally-actionable +signal; the IFC label is the precise, host-checkable evidence behind them. + +## Reference implementation + +- **Consumer:** a host-side IFC engine that parses the `{integrity, + confidentiality}` label, maintains a context label across tool results, and + applies a flow policy before egress operations already exists in practice. + (Linked once a public reference is available.) +- **Emitter (gap / proof point):** [`github-mcp-server`](https://github.com/github/github-mcp-server) + is the candidate — it already knows repository visibility and collaborator + sets, which are the confidentiality inputs. Repository visibility is only a + *default* hint, not the whole story: a public repository can serve + sub-resources that are **not** world-readable (draft security advisories, + draft releases, the collaborator roster itself, authenticated-user fields), so + a correct emitter MUST classify **per resource returned**, not per repository. + That makes the emitter a non-trivial proof point rather than a one-line + `repo.private` read. + +## Open questions + +- Should the label be inlinable on `_meta.ifc` directly for low-friction + adoption, or always behind `evidenceRef` for schema minimalism? (Lean: + permit both; `evidenceRef` is canonical, inline is a convenience.) +- How does GitHub Enterprise `internal` repository visibility map onto the + `public` / `private` confidentiality model? (Audience is the whole org, + strictly broader than collaborators — likely classified `private` and resolved + host-side, or falls back to default policy.) +- Registry coordination with other evidence schemes (e.g. SEP-2787) so + `evidenceRef.type` values don't collide. 
+ +## Changelog + +| Date | Change | +| ---------- | ------------------------------------------------------------ | +| 2026-06-10 | Initial draft skeleton. Reframed from a top-level `ifc` extension to a `trust-annotations` `evidenceRef` entry (`ifc.fides.v1`). | +| 2026-06-15 | Confidentiality limited to `public` / `private` on the wire (dropped reader-list); added Reader-set resolution section; emitter classifies per resource, not per repository. (Review: @JoannaaKL.) | +| 2026-06-16 | Lead the semantics with the wire-hint / host-resolved split; state the integrity-total vs confidentiality-partial asymmetry as principled (`public` = `⊤` is wire-computable, `private ⊔ private` is not); add fail-closed handling when resolution is unavailable and a no-durable-grant rule for resolved sets. (Review: @Rul1an.) | +| 2026-06-16 | Moved out of `specification/draft/` into `schemes/`; reframed from a sibling extension draft into a data-labelling scheme — one filler of the `trust-annotations` `evidenceRef` slot, not a peer extension. | From 59597f94dc6c52563eb8adaa1dbc292ee7965e74 Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Tue, 16 Jun 2026 08:58:29 +0200 Subject: [PATCH 2/3] Add data-class scheme skeleton; separate label-schemes from host architectures - schemes/data-class.md: skeleton for the structured classification taxonomy the trust-annotations 'sensitive' boolean deliberately omits (class + regulatory scope + org labels), shaped as open questions and attributed to the SEP-1913 reviewers who pushed for it (JustinCappos, olaservo, Mossaka, krubenok, localden). - schemes/README.md: list data-class under 'Schemes here'; add verified label- scheme candidates (Permissive IFC 2410.03055, AirGapAgent 2405.05175); move CaMeL and the design-pattern catalogue into a new 'Not schemes: host architectures' section, since they govern host control-flow rather than producing a per-result data label. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- schemes/README.md | 20 +++++-- schemes/data-class.md | 123 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 140 insertions(+), 3 deletions(-) create mode 100644 schemes/data-class.md diff --git a/schemes/README.md b/schemes/README.md index a7d63f1..b870e71 100644 --- a/schemes/README.md +++ b/schemes/README.md @@ -17,17 +17,19 @@ here rather than as a top-level extension. | Scheme | `evidenceRef.type` | Status | Source | | :--- | :--- | :--- | :--- | | [FIDES information-flow control](./ifc-fides.md) | `ifc.fides.v1` | Draft skeleton | [arXiv:2505.23643](https://arxiv.org/abs/2505.23643) | +| [Data classification](./data-class.md) | `data-class.v1` | Draft skeleton | SEP-1913 taxonomy (`class` + regulatory scope) | ## Candidate schemes (not yet drafted) The open `type` slot is designed to carry the range of models raised in the [SEP-1913](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913) -review and the surrounding literature. Each is a candidate for its own scheme doc: +review and the surrounding literature. Each is a candidate for its own scheme doc. +A scheme produces a **per-result data label** a server attaches: | Approach | Likely `type` | Source | | :--- | :--- | :--- | -| Coarse data classification (level + regulatory scope) | `data-class.v1` | SEP-1913 taxonomy (e.g. 
`confidential:hipaa` shape) | -| Design-pattern controls (Plan-Then-Execute, Dual LLM, Map-Reduce) | — | [arXiv:2506.08837](https://arxiv.org/abs/2506.08837) | +| Permissive information-flow labels (influence-based propagation) | `ifc.permissive.v1` | [arXiv:2410.03055](https://arxiv.org/abs/2410.03055) | +| Contextual-integrity classification (per-task minimisation) | `ci.airgap.v1` | [arXiv:2405.05175](https://arxiv.org/abs/2405.05175) | | ShardGuard | — | cited in SEP-1913 | | Capability-token constraints (SINT) | — | SEP-1913 review thread | | Caller/tool cosigning | — | SEP-1913 review thread | @@ -37,6 +39,18 @@ review and the surrounding literature. Each is a candidate for its own scheme do These are leads, not commitments. A candidate becomes a scheme when someone drafts it to the bar below; until then the slot simply stays open for it. +## Not schemes: host architectures + +Some models raised in the same discussion are **host/client control-flow +architectures**, not data labels — they decide what to *do* with information, +they don't produce a per-result record a server attaches. 
They are prior art (see +[`related-work.md`](../docs/related-work.md)), not entries here: + +| Approach | Why it isn't a scheme | Source | +| :--- | :--- | :--- | +| CaMeL (capability-based control/data-flow) | A host runtime; a capability token it issues *could* be referenced via `evidenceRef`, but the architecture isn't a label | [arXiv:2503.18813](https://arxiv.org/abs/2503.18813) | +| Design-pattern controls (Plan-Then-Execute, Dual LLM, Map-Reduce) | Client-side execution patterns, nothing on the wire | [arXiv:2506.08837](https://arxiv.org/abs/2506.08837) | + ## Bar for adding a scheme A scheme doc should state: diff --git a/schemes/data-class.md b/schemes/data-class.md new file mode 100644 index 0000000..74bf932 --- /dev/null +++ b/schemes/data-class.md @@ -0,0 +1,123 @@ +--- +title: Data classification (data-labelling scheme) +--- + +> ⚠️ **Experimental scheme skeleton.** This is one **data-labelling scheme** that +> fills the [`trust-annotations`](../specification/draft/trust-annotations.mdx) +> `evidenceRef` slot via `type: "data-class.v1"`. It is **not** an MCP extension +> and **not** a sibling of the extensions — it is the home for the richer +> classification taxonomy that the extension's coarse `sensitive` boolean +> deliberately leaves out. See [Why a scheme](#why-a-scheme-not-a-wire-field). + +**Scheme** — fills `trust-annotations`'s `evidenceRef` slot, selected by +`evidenceRef.type == "data-class.v1"`. Not an extension. + +## Abstract + +This scheme defines `data-class.v1`, an entry for the `trust-annotations` +`evidenceRef` slot that carries a **structured data classification**: a coarse +sensitivity level plus zero or more regulatory scopes and optional org-defined +labels. It is the structured counterpart to the extension's single +`sensitive: boolean`, for deployments that need to express *which* regulated +category a result falls under (e.g. HIPAA, GDPR, PCI-DSS) without putting that +taxonomy on the wire for every server. 
+ +## Why a scheme, not a wire field + +The `trust-annotations` extension keeps only `sensitive: boolean` on the wire. +That coarse signal is universally client-actionable and cheap, and — critically +— it cannot become wrong as regulation changes. A richer classification was +asked for repeatedly in review, in three different shapes: + +- a **linear** `sensitiveHint: low | medium | high`, which + [@JustinCappos](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2967516811) + rejected: sensitivity is set-theoretic (a card number and a medical record are + both sensitive but to *different* readers), not a single scale; +- **org-defined vocabularies** rather than a fixed enum + ([@olaservo](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2968743154); + [@Mossaka](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/711#issuecomment-2971788308) + proposed reverse-DNS `classification.labels`); +- a **class + regulatory scope** pairing such as `confidential:hipaa` + ([@krubenok](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#discussion_r3103485194); + carried into SEP-2061 by @rreichel3). + +Against all three, [@localden](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1913#issuecomment-4037623595) +warned that a taxonomy baked into the protocol is very hard to remove if it turns +out wrong. A scheme resolves the tension: the taxonomy evolves behind a `type` +value, on its own clock, while the wire keeps the boolean that can never rot. + +## Motivation + +A regulated deployment needs to know not just *that* a result is sensitive but +*under which regime*, so the host can apply the matching control (HIPAA minimum- +necessary, GDPR purpose limitation, PCI-DSS storage rules). The boolean cannot +carry that, and encoding it inline would force every server and client to agree +on a regulatory taxonomy. 
This scheme lets a server that already knows the +classification (a healthcare record store, a payments API) attach it as evidence, +while a client that does not implement the scheme still sees `sensitive: true`. + +## Specification + +### Scheme identity + +A `data-class.v1` record is selected by `evidenceRef.type == "data-class.v1"` on +a `trust-annotations` annotation whose `sensitive` is `true`. The +`digest`/`canonicalization` pair over the record is required exactly as for any +other scheme. + +### Payload (skeleton) + +The record is a JSON object. Shapes below are **open questions**, not settled: + +```json +{ + "class": "confidential", + "regulatory": ["hipaa"], + "labels": ["com.example.pii.ssn"], + "policyRef": "https://policy.example.com/classes/confidential" +} +``` + +- `class` — a coarse level. Candidate set `public | internal | confidential | + restricted` (four levels, from the original SEP-1913 taxonomy). Whether the set + is fixed or registry-curated is open. +- `regulatory` — zero or more regulatory scopes the result falls under. Open + strings, not an enum (so a new regime doesn't need a schema change). This is + the `:scope` half of @krubenok's `confidential:hipaa`. +- `labels` — optional org-defined tags, reverse-DNS namespaced per @Mossaka, for + vocabularies a deployment defines for itself. +- `policyRef` — optional pointer to the policy that defines the classes, so the + record is interpretable by a host that hasn't pre-agreed the vocabulary. + +### Graceful degradation + +A client that does not implement `data-class.v1` ignores the record and relies on +the `trust-annotations` `sensitive` boolean, which remains meaningful on its own. +The scheme only ever *refines* the boolean; it never contradicts it (a +`data-class.v1` record only appears where `sensitive` is already `true`). 
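To make the degradation rule concrete, here is a small Python sketch of a host deciding what to do with one tool result. Everything except the `sensitive` boolean, the `data-class.v1` type value, and the `regulatory` field is an assumption for illustration: the function name `choose_control`, the `implements_scheme` flag, the scope-to-control table, and the `allow` outcome are all hypothetical.

```python
from typing import Optional

DATA_CLASS_TYPE = "data-class.v1"

# Hypothetical host policy: regulatory scope -> control. Not part of the scheme.
CONTROLS_BY_SCOPE = {"hipaa": "redact", "pci-dss": "block"}
STRICTNESS = ("prompt", "redact", "block")  # weakest to strictest


def choose_control(
    sensitive: bool,
    evidence_type: Optional[str],
    record: Optional[dict],
    implements_scheme: bool,
) -> str:
    """Return "allow", "prompt", "redact" or "block" for one tool result."""
    if not sensitive:
        return "allow"
    # Floor: the boolean alone is actionable, even with no scheme support.
    control = "prompt"
    if implements_scheme and evidence_type == DATA_CLASS_TYPE and record:
        # Refinement: pick the strictest control among recognised scopes.
        for scope in record.get("regulatory", []):
            candidate = CONTROLS_BY_SCOPE.get(scope, "prompt")
            if STRICTNESS.index(candidate) > STRICTNESS.index(control):
                control = candidate
    return control
```

A host that does not implement the scheme, or that meets an unrecognised scope, ends up at the same `prompt` floor, so the record can tighten the decision but never loosen it.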
+ +## Producer / consumer + +- **Producer (candidate).** A server fronting a regulated store (health records, + payments) that already classifies its data internally. +- **Consumer (candidate).** A host policy engine that maps `class` + `regulatory` + to a control (block, redact, prompt). The same host-side resolution model as + the FIDES scheme applies: the wire marker is advisory; the regulated semantics + are host-resolved against the deployment's policy. + +## Open questions + +- Is the `class` set fixed (`public/internal/confidential/restricted`) or + registry-curated? @localden's removability concern argues against fixing it. +- Do `regulatory` and `labels` overlap enough to collapse into one open-string + list, or are "named regime" and "org-defined tag" usefully distinct? +- Should `data-class.v1` ever be expressible **on the wire** as a structured + escape hatch (tracked in [open-questions.md](../docs/open-questions.md)), + or is keeping it strictly behind `evidenceRef` the right boundary? +- How does this relate to SEP-2061's `DataClass` / `RegulatoryScope` — is this + scheme the migration target for that work, or a parallel encoding? + +## Changelog + +- **2026-06-17** — Initial skeleton. Payload shapes are candidates pending the + open questions above. From daf51770fecdd9bdb0bb0d3986b788935430656a Mon Sep 17 00:00:00 2001 From: Sam Morrow Date: Tue, 16 Jun 2026 10:07:34 +0200 Subject: [PATCH 3/3] Tie schemes to the 'sensitive' floor: refine it, emit both - schemes/README.md: note the wire vocabulary (sensitive) is the lowest-common- denominator floor and schemes refine it; producers are encouraged to emit both. - schemes/data-class.md: add an explicit emit-both guideline in graceful degradation; correct the changelog date 2026-06-17 -> 2026-06-16. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- schemes/README.md | 5 ++++- schemes/data-class.md | 6 ++++-- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/schemes/README.md b/schemes/README.md index b870e71..962c715 100644 --- a/schemes/README.md +++ b/schemes/README.md @@ -4,7 +4,10 @@ A **scheme** is a concrete data-labelling or tool-annotation approach that fills the [`trust-annotations`](../specification/draft/trust-annotations.mdx) `evidenceRef` slot under an `evidenceRef.type` value. The extension defines a small, stable wire vocabulary and an open `type` pointer; a scheme defines the -richer, out-of-band record that pointer resolves to. +richer, out-of-band record that pointer resolves to. The wire vocabulary (notably +`sensitive`) is a lowest-common-denominator floor every client can act on; +schemes refine it for hosts that implement them, and a server that can classify +more precisely is encouraged to emit **both** the floor and a scheme record. Schemes are **not** extensions and **not** siblings of the extensions. They are interchangeable: a deployment can adopt one, several, or none, and can swap them diff --git a/schemes/data-class.md b/schemes/data-class.md index 74bf932..b59407c 100644 --- a/schemes/data-class.md +++ b/schemes/data-class.md @@ -94,7 +94,9 @@ The record is a JSON object. Shapes below are **open questions**, not settled: A client that does not implement `data-class.v1` ignores the record and relies on the `trust-annotations` `sensitive` boolean, which remains meaningful on its own. The scheme only ever *refines* the boolean; it never contradicts it (a -`data-class.v1` record only appears where `sensitive` is already `true`). +`data-class.v1` record only appears where `sensitive` is already `true`). Because +the boolean is the lowest-common-denominator floor, a producer of this scheme +SHOULD emit **both** the boolean and the record, never the record alone. 
## Producer / consumer @@ -119,5 +121,5 @@ The scheme only ever *refines* the boolean; it never contradicts it (a ## Changelog -- **2026-06-17** — Initial skeleton. Payload shapes are candidates pending the +- **2026-06-16** — Initial skeleton. Payload shapes are candidates pending the open questions above.