Skip to content

docs(rfc-003): canonical MCP implementation blueprint (omnigraph-mcp crate)#182

Open
ragnorc wants to merge 2 commits into
mainfrom
ragnorc/omnigraph-mcp-crate
Open

docs(rfc-003): canonical MCP implementation blueprint (omnigraph-mcp crate)#182
ragnorc wants to merge 2 commits into
mainfrom
ragnorc/omnigraph-mcp-crate

Conversation

@ragnorc

@ragnorc ragnorc commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary

Updates RFC-003 into the authoritative, build-from-scratch spec for the in-server MCP surface, with a dedicated omnigraph-mcp crate from the start. Docs-only (one file); no code.

A reference implementation shipped in #157 (it proved rmcp 1.7 integrates on edition 2024, the auth-extension passthrough works, the conformance MUSTs are met, and the surface splits cleanly into a transport crate + a server backend). This RFC folds those learnings into a clean spec so the surface can be implemented properly from main with the crate baked in — not retrofitted.

What's in the new "Implementation Blueprint" (B.1–B.13)

A self-contained, ordered build guide:

  • B.1 Crate architecture + dependency rulecrates/omnigraph-mcp (rmcp Streamable-HTTP transport + an McpBackend trait + rmcp re-exports); omnigraph-server depends on it and implements the trait. The dependency must go server → mcp (a mcp → server dep cycles at the package level: server-bin → mcp → server-lib). The trait inverts the direction, so the crate never names an omnigraph type.
  • B.2–B.3 Seam + transport — the &http::request::Parts passthrough (how the server-injected ResolvedActor / GraphHandle reach the backend), and the verified rmcp config that gives GET→405, Origin→403, and MCP-Protocol-Version validation for free.
  • B.4 Backend reuse — factor 10 thin do_* fns; reuse run_query/run_mutate/authorize/api::query_catalog_entry/ApiError.
  • B.5–B.9 — the 13-tool catalog + Cedar mapping, ParamKind → JSON Schema, deny-masking + isError split, stored-query double-gate, two resources.
  • B.10 Routing (resolved) — per-graph /mcp; graphs_list / omnigraph://graphs dropped from MCP (graph discovery stays REST-only via GET /graphs); explicitly rules out a single flat /mcp with graph_id-in-call (breaks per-graph stored-query listing + isolation).
  • B.11 Auth — the decoupling invariant + a client-compatibility matrix (static bearer works for Claude Code / Cursor / VS Code / the API connector; claude.ai web + ChatGPT need the OAuth/RFC-9728 fast-follow).
  • B.12–B.13 — test plan + verification (cargo tree | grep rmcp confirms rmcp is not a direct server dep), and the locked decisions (resolves the Open Questions).

Housekeeping in the same edit

  • Marked the old "Reconciliation with shipped code" section as pre-MCP history.
  • Marked the §5 sketches as design rationale (superseded by the Blueprint where they differ — e.g. the per-tool McpTool trait became a Builtin enum + the McpBackend crate trait).
  • Resolved the Open Questions inline.

Relationship to #157

#157 is the working reference implementation (in-server MCP without the crate split). This RFC is the spec for the clean version; the implementation will be built on a fresh branch from main following B.1–B.13, cribbing from #157.


Note

Low Risk
Docs-only change to a single RFC file; no runtime code or configuration is modified.

Overview
RFC-003 moves from Proposed to the canonical spec for a clean in-server MCP reimplementation from main, citing #157 as the reference proof and pointing implementers at the new Implementation Blueprint (B.1–B.13) instead of the original §5 sketches.

The blueprint adds an ordered build guide: split crates/omnigraph-mcp (rmcp Streamable HTTP + McpBackend + re-exports) from omnigraph-server (implements the trait; server → mcp only), &http::request::Parts extension passthrough for auth/graph context, deployment-derived Host/Origin policy (not rmcp’s loopback default), do_* refactor + reuse of existing invoke/authorize/catalog paths, 13 built-in tools + Cedar mapping, stored-query JSON Schema and list/call deny-masking, per-graph /mcp with graphs_list / omnigraph://graphs dropped from MCP, auth client matrix, and B.12 tests/verification.

Housekeeping in the same doc: Summary and stdio-package relationship text updated (v0.6.1 / 16 tools vs in-server 13); Reconciliation labeled pre-MCP history; §5, Testing, Rollout, and Open Questions marked superseded or resolved via B.13 (per-query invoke_query and OAuth remain deferred).

Reviewed by Cursor Bugbot for commit 2ade97f. Bugbot is set up for automated code reviews on this repo. Configure here.

Greptile Summary

Promotes RFC-003 from "Proposed" to the canonical, authoritative build spec for a clean omnigraph-mcp crate implementation, folding in every verified decision from the reference work in #157. One concrete spec inconsistency was found.

  • Implementation Blueprint (B.1–B.13) is the bulk of the addition: it specifies the crate split (omnigraph-mcp transport + omnigraph-server backend), the McpBackend trait seam using &http::request::Parts for auth passthrough, host/origin policy derived from deployment context, the 13-tool Cedar-mapped catalog, stored-query projection, classify error mapping, and a full test/verification plan.
  • Housekeeping: the old Reconciliation section is marked historical, §5 is reframed as superseded design rationale, Open Questions are resolved inline pointing to B.13, and the stdio package figures are refreshed to v0.6.1 / 16 tools.
  • Spec inconsistency: B.4's mcp_router call site passes only two arguments while B.3 defines the function with three (backend, body_limit, hosts: McpHostPolicy); following B.4 literally would produce a compile error.

Confidence Score: 4/5

Documentation-only change; safe to merge with the B.4 call-site inconsistency corrected before the implementation branch begins.

The only changed file is an RFC markdown document with no runtime impact. The spec is internally consistent except for one call site in B.4 that omits the required hosts: McpHostPolicy argument that B.3 defines — an implementer following that example verbatim would hit a compile error at the first build. No code, migrations, or configs are modified.

docs/dev/rfc-003-mcp-server-surface.md — the B.4 mcp_router call site needs the hosts argument added before implementation begins.

Important Files Changed

Filename Overview
docs/dev/rfc-003-mcp-server-surface.md Comprehensive implementation blueprint (B.1–B.13) added for the omnigraph-mcp crate; one spec inconsistency: B.4's example call to mcp_router omits the required hosts: McpHostPolicy third argument defined in B.3

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant Middleware as Auth Middleware
    participant McpCrate as omnigraph-mcp McpService
    participant Backend as OmnigraphMcpBackend
    participant Cedar as Cedar Policy Engine
    participant Engine as Graph Engine

    Client->>Middleware: "POST /graphs/{id}/mcp Authorization: Bearer"
    Middleware->>McpCrate: "Extensions: ResolvedActor + Arc<GraphHandle>"
    McpCrate->>Backend: call_tool(parts, name, args)
    Backend->>Cedar: authorize(PolicyAction, actor, branch)
    Cedar-->>Backend: "Allowed | Denied"
    alt Denied
        Backend-->>McpCrate: -32602 unknown tool
    else Allowed
        Backend->>Engine: "do_* / run_query"
        Engine-->>Backend: "Ok(result) | Err(ApiError)"
        Backend->>Backend: classify(result)
        Backend-->>McpCrate: "CallToolResult | McpError"
    end
    McpCrate-->>Client: JSON-RPC response
Loading

Fix All in Claude Code

Reviews (2): Last reviewed commit: "docs(rfc-003): correct-by-construction f..." | Re-trigger Greptile

Greptile also left 1 inline comment on this PR.

…crate)

Brings RFC-003 to the canonical from-scratch spec for the in-server MCP surface:
a dedicated omnigraph-mcp transport crate + McpBackend trait (server -> mcp dep
to avoid a Cargo cycle), the verified rmcp 1.7 design (conformance MUSTs for
free), backend reuse (do_* / run_query / run_mutate / authorize / api), the
13-tool catalog + Cedar mapping, ParamKind -> JSON Schema, deny-masking +
isError split, two resources, per-graph routing (graphs_list stays REST-only),
auth + client-compat matrix + OAuth fast-follow, tests, and locked decisions.

Build the implementation on this branch from blueprint sections B.1-B.13. The
spec is informed by the #157 reference implementation.
@ragnorc ragnorc requested a review from aaltshuler as a code owner June 10, 2026 19:51

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3009564437

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread docs/dev/rfc-003-mcp-server-surface.md Outdated

- A generic `struct McpService<B>` implements rmcp's `ServerHandler`, delegating each method to `B` after extracting `&Parts` from `ctx.extensions` once (missing → `McpError::internal_error`). `get_info → backend.server_info()`; `initialize`/`ping` use rmcp's defaults.
- `pub fn mcp_router<B: McpBackend>(backend: B, body_limit: usize) -> axum::Router`:
- `let svc = StreamableHttpService::new(move || Ok(McpService::new(backend.clone())), Arc::new(LocalSessionManager::default()), config)` with `config = StreamableHttpServerConfig::default().with_stateful_mode(false).with_json_response(true)` (`StreamableHttpServerConfig` is `#[non_exhaustive]` — build from `Default`, mutate via the `with_*` setters; keep the loopback `allowed_hosts` default).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Configure allowed hosts for non-loopback MCP deployments

In any remote or cloud deployment, keeping rmcp's default allowed_hosts makes /mcp reject valid clients before bearer auth, because rmcp 1.7 defaults Host validation to loopback authorities only (localhost, 127.0.0.1, ::1). This RFC explicitly targets remotely reachable MCP clients, so the blueprint should require a server-configured allowed-host allowlist (or a documented reverse-proxy Host validation setup) instead of preserving the loopback default.

Useful? React with 👍 / 👎.

Comment on lines +157 to +159
- **Ad-hoc `query`/`mutate` always exposed, Cedar-only**; no `mcp.allow_adhoc`. [Open Q3 → resolved.]
- **`query`/`mutate` ids only**, no `read`/`change` aliases. [Open Q7 → resolved.]
- **Per-graph `/mcp` routing**; `graphs_list`/`omnigraph://graphs` **dropped from MCP** (graph discovery via REST `GET /graphs`); no server-scoped MCP tools. [Open Q5 → resolved.]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reconcile the stale rollout with locked decisions

These locked decisions conflict with the still-active Rollout and Testing sections below, which continue to tell implementers to ship graphs_list, read/change aliases, a server-level /mcp, and an mcp.allow_adhoc switch. Since the status says the blueprint only supersedes the older §5 sketches, someone following the Rollout section would build surface area this canonical section says was dropped; please update those later sections or explicitly mark them historical too.

Useful? React with 👍 / 👎.

Comment thread docs/dev/rfc-003-mcp-server-surface.md Outdated
Comment thread docs/dev/rfc-003-mcp-server-surface.md Outdated
Fold the valid #157 review findings into the blueprint as by-design fixes so a
from-scratch build cannot reintroduce them:

- B.3.1: host/origin policy derived from the server bind + config (loopback ->
  loopback hosts; non-loopback -> configured public host, else Host-allowlist
  off with bearer + Origin as the controls). Closes the loopback-only-default
  footgun that 403s every remote client before bearer auth.
- B.6/B.7: stored-query params nested under a `params` object (mirrors
  POST /queries/{name}), so the branch/snapshot knobs cannot collide with a
  query param; mutation tools omit `snapshot` + additionalProperties:false, so
  mutation-against-a-snapshot is unrepresentable. Vector `dim` made intrinsic to
  the kind, killing the unwrap_or(0) zero-length-array schema.
- B.8: one canonical classify(Result<_, ApiError>) for tool errors (5xx ->
  JSON-RPC, 4xx/409 -> isError), no second mapper to drift; list visibility uses
  the call-path default-branch authorization, not a branch:None probe that hid
  branch-scoped-grant tools.
- Reconciled the stale Summary count and the section-5-era Rollout/Testing
  sections (banners pointing to B.12/B.4/B.13).
@ragnorc

ragnorc commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

Addressed the valid review findings as correct-by-construction fixes (commit 2ade97f) so a from-scratch build can't reintroduce them:

The "missing server.md MCP section" finding was stale — reviewers saw commit 94bbf67, before the docs commit added it.

**Reuse, never reinvent.** First factor **10 thin `do_*` fns** out of the inline `server_*` HTTP handlers (each is `authorize_request(...) → engine call → DTO`) so REST and MCP dispatch one path: `do_snapshot`, `do_schema_get`, `do_branches_list`, `do_commits_list`, `do_commit_show`, `do_ingest`, `do_branch_create`, `do_branch_delete`, `do_branch_merge`, `do_schema_apply`. (No `do_graphs_list` — `graphs_list` is not an MCP tool, see B.10; `server_graphs_list` stays inline for `GET /graphs`.) Land that as a behavior-neutral refactor commit first (it keeps the REST handlers as thin wrappers; all server tests stay green). Then reuse as-is: `run_query` / `run_mutate` (already decoupled from request bodies), `authorize` → `Authz { Allowed, Denied(msg) }` (with `Err` reserved for operational 401/500), `api::query_catalog_entry` / `ParamKind` / `read_output`, `ApiError` (add `pub(crate) status_code()` + `message_str()` accessors for the error classifier). Mount in `build_app`: `.merge(mcp::mcp_router(state))` inside the `per_graph_protected` group, where the server's thin `mcp::mcp_router(state) = omnigraph_mcp::mcp_router(OmnigraphMcpBackend::new(state), INGEST_REQUEST_BODY_LIMIT_BYTES)`.

Represent the built-ins as a `Builtin` enum (one variant per tool; `descriptor` / `gate` / `call` as match arms) — lower liability than ~14 unit structs + `dyn` + `async-trait` per tool. Stored-query tools are a sibling populator over `handle.queries`.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 B.4 call site missing hosts: McpHostPolicy argument

The mcp_router function defined in B.3 has the signature pub fn mcp_router<B: McpBackend>(backend: B, body_limit: usize, hosts: McpHostPolicy) -> axum::Router (three parameters), but the mounting example in B.4 only passes two: omnigraph_mcp::mcp_router(OmnigraphMcpBackend::new(state), INGEST_REQUEST_BODY_LIMIT_BYTES). The hosts: McpHostPolicy argument is absent. An implementer following B.4 verbatim would get a Rust compile error at that call site. The call should be updated to pass the McpHostPolicy computed at startup per B.3.1 — for example omnigraph_mcp::mcp_router(OmnigraphMcpBackend::new(state), INGEST_REQUEST_BODY_LIMIT_BYTES, McpHostPolicy::from_bind_config(&state.config)).

Fix in Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant