Portable, always-fresh docs for agents and humans.
Agents read your docs on every run, and they can't tell a current doc from a rotted one. Surface closes that gap with two open layers: OKF (Google's vendor-neutral format) makes docs portable and accessible, and Surface governs their freshness - you anchor a sentence to the code it describes, and the build fails until a human re-confirms it whenever that code's logic changes. Documentation, governed like code. Portable and fresh docs measurably improve how well agents perform.
Deterministic. No model, no network, no API key in the core.
Docs: surface.gradientdev.xyz · Install: docs/getting-started/install.md
You write a context file for your codebase - an architecture note, an AGENTS.md, a hub for the
auth flow. The day you write it, it's accurate.
Then the code moves. Someone refactors the function you described; the behavior changes on purpose, the tests get updated, CI goes green, the PR merges. Everything is correct - except the paragraph that described that function. Nobody touched it, for two ordinary reasons: they didn't know it existed, and there was no standard place to look. It now says something untrue.
Nothing failed. Nothing fired. The only thing that broke is the explanation the next engineer - and every agent on every run - will trust and reason from. A codebase can be fully green on tests while its docs quietly describe code that no longer works the way they say, and nothing in your toolchain catches the gap.
Surface closes that gap two ways: hubs/ give documentation a standard home so people and
agents actually find it, and surf check governs the prose like a test so it can't silently
rot.
You anchor a sentence to the code it's about:
# hubs/auth.md (a "hub" - frontmatter + prose, lives wherever you like)
anchors:
- claim: "refresh rotation is single-use; reuse triggers global logout"
at: "src/auth/refresh.ts > rotateRefreshToken"
hash: 9b1c33asurf check runs in CI. For each anchor it finds the symbol, reduces it to pure logic (ignoring
formatting, comments, and renames), fingerprints that, and compares it to the fingerprint stored
the last time a human confirmed the sentence was true.
- Matches → the logic didn't meaningfully change → silent pass.
- Differs → the code diverged from its description → block the merge with a precise report: which hub, which claim, old code vs. new code.
Quiet on cosmetics, loud on logic. Reformatting, comments, and consistent renames don't fire; a
flipped operator, a relaxed comparison, or a dropped await does. The full mechanism is in
How the gate works.
A hub is a conformant Open Knowledge Format concept - Google's vendor-neutral
standard for knowledge as markdown + frontmatter. OKF standardizes how knowledge is written down but
deliberately omits freshness; that's exactly what Surface adds. Surface = OKF + the freshness OKF
leaves out. Your hubs drop into any OKF consumer (Knowledge Catalog, the OKF visualizer, Obsidian,
git-backed doc editors), which read the prose and ignore the anchors: Surface governs. See
Surface and OKF.
Fresh, portable docs measurably improve how well agents perform.
An agent reads your docs on every run and reasons from them as if they were true. We measured what that's worth: across multiple models from three providers, agents given an accurate doc completed the task far more often than agents working from a stale one - a stale doc was even worse than no doc at all, and a more capable model was no more resistant. Just surfacing the drift recovered most of the gap. OKF makes the docs portable enough to reach every agent; Surface keeps them fresh enough to trust.
That's from a pre-registered, deterministically-graded benchmark
(3,250 graded completions, multi-turn agents, no LLM judge). It measures drift of exactly the kind
surf check catches; the author of the benchmark also authors Surface, and a null result on any
hypothesis was reportable - see the write-up
for the full method, limitations, and data.
surf init # writes surf.toml + creates hubs/
surf new auth # creates hubs/auth.md - add a claim and point at: at a symbol
surf lint # does every anchor resolve to exactly one symbol?
surf check # the gate - a new claim is "unverified" until you seal it
surf verify # you read the prose and confirmed it; seal the hashChange the logic of an anchored symbol and the gate blocks until someone re-reads the sentence:
$ surf check
DIVERGED hubs/auth.md :: src/auth/refresh.ts > rotateRefreshToken
stored 9b1c33ade8f1 → now 4d5e6f2a0b7c (magnitude: Small)
claim: refresh rotation is single-use; reuse triggers global logout
surf check: 1 divergence(s).
If the prose still holds, surf verify re-seals it; if it's now false, fix the prose first. Full
walkthrough: Quickstart.
Read this part. It's the difference between a tool you trust and one that burns you.
- It does not tell you your docs are true. It tells you the code they point at changed, so a human should re-read the prose. A green check means "nothing drifted since the last sign-off," not "everything is correct."
- It only watches what you anchored. A change in a file no hub points at can still invalidate a documented invariant; Surface won't see it. That's security review and taint analysis - a different discipline.
- It is not a retrieval system. It doesn't search, embed, or serve context. It optimizes a different thing: trust in what you retrieve.
The fuzzy "is this claim still true" judgment lives in an optional reviewer plugin that reads Surface's JSON output. The core never depends on it. More in What Surface does NOT do and Is Surface for you?.
Most repos never install the binary - they run the GitHub Action:
# .github/workflows/surface.yml
- uses: actions/checkout@v4 # plain checkout - do NOT set fetch-depth: 0
- uses: Connorrmcd6/surface@v0.8.0Or the install script:
curl --proto '=https' --tlsv1.2 -fsSL https://raw.githubusercontent.com/Connorrmcd6/surface/main/install.sh | shPrebuilt binaries for macOS (Apple Silicon) and Linux (x86_64); build from source on other
Unix arches. Windows is not supported (the anchor grammar requires forward-slash paths).
Full options - pre-commit hook, cargo install, the architecture matrix - in
Install.
Full docs at surface.gradientdev.xyz (source in
docs/):
- Quickstart · Install
- Authoring hubs - claims, anchor grammar, granularity, the verify loop.
- Surface and OKF - hubs as conformant Open Knowledge Format concepts.
- CI integration - the Action, the pre-commit hook, scoping a PR.
- Examples - a minimal hub in each supported language.
- Reference: Commands · Configuration · How the gate works · FAQ
Release history is in CHANGELOG.md. AI agents working in this repo: see
AGENTS.md.
The agent-impact benchmark - measuring how much documentation accuracy changes an agent's task
performance - lives in its own repo: Connorrmcd6/surface-bench.
It consumes the surf binary's output but has no inbound dependency on this core. (It previously
lived under bench/ here.)
The naming isn't decoration: the gradient of a field is everywhere perpendicular to its level surfaces - the direction of change, and the thing the change is measured against. Surface reports divergence between what your docs claim and what your code does.