Skip to content

Connorrmcd6/surface

Repository files navigation

Surface

Portable, always-fresh docs for agents and humans.

Agents read your docs on every run, and they can't tell a current doc from a rotted one. Surface closes that gap with two open layers: OKF (Google's vendor-neutral format) makes docs portable and accessible, and Surface governs their freshness - you anchor a sentence to the code it describes, and the build fails until a human re-confirms it whenever that code's logic changes. Documentation, governed like code. Portable and fresh docs measurably improve how well agents perform.

Deterministic. No model, no network, no API key in the core.

Docs: surface.gradientdev.xyz · Install: docs/getting-started/install.md


The problem

You write a context file for your codebase - an architecture note, an AGENTS.md, a hub for the auth flow. The day you write it, it's accurate.

Then the code moves. Someone refactors the function you described; the behavior changes on purpose, the tests get updated, CI goes green, the PR merges. Everything is correct - except the paragraph that described that function. Nobody touched it, for two ordinary reasons: they didn't know it existed, and there was no standard place to look. It now says something untrue.

Nothing failed. Nothing fired. The only thing that broke is the explanation the next engineer - and every agent on every run - will trust and reason from. A codebase can be fully green on tests while its docs quietly describe code that no longer works the way they say, and nothing in your toolchain catches the gap.

Surface closes that gap two ways: hubs/ give documentation a standard home so people and agents actually find it, and surf check governs the prose like a test so it can't silently rot.

How it works, in one breath

You anchor a sentence to the code it's about:

# hubs/auth.md  (a "hub" - frontmatter + prose, lives wherever you like)
anchors:
  - claim: "refresh rotation is single-use; reuse triggers global logout"
    at: "src/auth/refresh.ts > rotateRefreshToken"
    hash: 9b1c33a

surf check runs in CI. For each anchor it finds the symbol, reduces it to pure logic (ignoring formatting, comments, and renames), fingerprints that, and compares it to the fingerprint stored the last time a human confirmed the sentence was true.

  • Matches → the logic didn't meaningfully change → silent pass.
  • Differs → the code diverged from its description → block the merge with a precise report: which hub, which claim, old code vs. new code.

Quiet on cosmetics, loud on logic. Reformatting, comments, and consistent renames don't fire; a flipped operator, a relaxed comparison, or a dropped await does. The full mechanism is in How the gate works.

Speaks OKF

A hub is a conformant Open Knowledge Format concept - Google's vendor-neutral standard for knowledge as markdown + frontmatter. OKF standardizes how knowledge is written down but deliberately omits freshness; that's exactly what Surface adds. Surface = OKF + the freshness OKF leaves out. Your hubs drop into any OKF consumer (Knowledge Catalog, the OKF visualizer, Obsidian, git-backed doc editors), which read the prose and ignore the anchors: Surface governs. See Surface and OKF.

Why it matters

Fresh, portable docs measurably improve how well agents perform.

An agent reads your docs on every run and reasons from them as if they were true. We measured what that's worth: across multiple models from three providers, agents given an accurate doc completed the task far more often than agents working from a stale one - a stale doc was even worse than no doc at all, and a more capable model was no more resistant. Just surfacing the drift recovered most of the gap. OKF makes the docs portable enough to reach every agent; Surface keeps them fresh enough to trust.

That's from a pre-registered, deterministically-graded benchmark (3,250 graded completions, multi-turn agents, no LLM judge). It measures drift of exactly the kind surf check catches; the author of the benchmark also authors Surface, and a null result on any hypothesis was reportable - see the write-up for the full method, limitations, and data.

Quickstart

surf init              # writes surf.toml + creates hubs/
surf new auth          # creates hubs/auth.md - add a claim and point at: at a symbol
surf lint              # does every anchor resolve to exactly one symbol?
surf check             # the gate - a new claim is "unverified" until you seal it
surf verify            # you read the prose and confirmed it; seal the hash

Change the logic of an anchored symbol and the gate blocks until someone re-reads the sentence:

$ surf check
DIVERGED  hubs/auth.md :: src/auth/refresh.ts > rotateRefreshToken
    stored 9b1c33ade8f1 → now 4d5e6f2a0b7c  (magnitude: Small)
    claim: refresh rotation is single-use; reuse triggers global logout
surf check: 1 divergence(s).

If the prose still holds, surf verify re-seals it; if it's now false, fix the prose first. Full walkthrough: Quickstart.

What Surface does NOT do

Read this part. It's the difference between a tool you trust and one that burns you.

  • It does not tell you your docs are true. It tells you the code they point at changed, so a human should re-read the prose. A green check means "nothing drifted since the last sign-off," not "everything is correct."
  • It only watches what you anchored. A change in a file no hub points at can still invalidate a documented invariant; Surface won't see it. That's security review and taint analysis - a different discipline.
  • It is not a retrieval system. It doesn't search, embed, or serve context. It optimizes a different thing: trust in what you retrieve.

The fuzzy "is this claim still true" judgment lives in an optional reviewer plugin that reads Surface's JSON output. The core never depends on it. More in What Surface does NOT do and Is Surface for you?.

Install

Most repos never install the binary - they run the GitHub Action:

# .github/workflows/surface.yml
- uses: actions/checkout@v4   # plain checkout - do NOT set fetch-depth: 0
- uses: Connorrmcd6/surface@v0.8.0

Or the install script:

curl --proto '=https' --tlsv1.2 -fsSL https://raw.githubusercontent.com/Connorrmcd6/surface/main/install.sh | sh

Prebuilt binaries for macOS (Apple Silicon) and Linux (x86_64); build from source on other Unix arches. Windows is not supported (the anchor grammar requires forward-slash paths). Full options - pre-commit hook, cargo install, the architecture matrix - in Install.

Documentation

Full docs at surface.gradientdev.xyz (source in docs/):

Release history is in CHANGELOG.md. AI agents working in this repo: see AGENTS.md.

Benchmark

The agent-impact benchmark - measuring how much documentation accuracy changes an agent's task performance - lives in its own repo: Connorrmcd6/surface-bench. It consumes the surf binary's output but has no inbound dependency on this core. (It previously lived under bench/ here.)


The naming isn't decoration: the gradient of a field is everywhere perpendicular to its level surfaces - the direction of change, and the thing the change is measured against. Surface reports divergence between what your docs claim and what your code does.

About

Portable, always-fresh docs for agents and humans. A Surface hub is a conformant OKF concept; Surface adds the freshness OKF leaves out - it fails the build when the code a doc describes changes. Deterministic, no LLM.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors