feat: DRM and Constraint Catalog binding in ZeroID tokens (#59) #151
safayavatsal wants to merge 4 commits into
…ai#59)

- Migration 014: append-only decision_rights_matrix (DB trigger blocks UPDATE/DELETE), constraint_catalog_versions (Hash can repeat across rows on 24h re-sign), and drm_hash + constraint_catalog_hash columns on issued_credentials with partial indexes for drift detection.
- Domain types: DRMDocument/DecisionRightsMatrix (+ ErrDRMUnauthorized and ErrDRMInvalid sentinels), ConstraintCatalogVersion, and SignalTypePolicyDrift.
- GovernanceService: deterministic canonical-JSON hashing (sorted keys recursively), DRM authorization with exact and trailing-* SPIFFE pattern matching, catalog publish/re-sign with ES256 signing over sha256(hash||"|"||signed_at). policy_drift signals fan out from PublishDRM/PublishCatalog on hash change.
- CatalogSignerWorker re-signs each tenant's active catalog every 24h (preserves Hash, rewrites SignedAt+Signature so outstanding tokens stay valid).
- OAuthService.tokenExchange runs the DRM authorization gate before IssueCredential and embeds drm_version/drm_hash/constraint_catalog_version/constraint_catalog_hash claims. authorization_code embeds the claims for audit but skips the gate (consent is not delegation). All four claim names added to reservedClaims; Introspect passes them through.
- Admin endpoints under /governance/decision-rights-matrix and /governance/constraint-catalog. Routes self-suppress when governanceSvc is nil. Every code path is no-op for tenants that haven't published a DRM/catalog row -- pre-highflame-ai#59 flows are unchanged.
- Integration tests: happy path with claim assertions, DRM-deny path, and no-config backward-compat path. Each test uses a fresh tenant to keep the append-only DRM rows from leaking across the run.
Code Review
This pull request introduces a governance framework featuring a Decision-Rights Matrix (DRM) and a Constraint Catalog to bind policy snapshots to issued credentials. Key additions include the GovernanceService for managing these artifacts, new API endpoints for publishing and retrieving them, and a background worker for periodic catalog re-signing. Feedback highlights several critical improvements: the emitDriftSignals function should perform asynchronous fan-out with pagination to prevent blocking API responses and OOM risks; the canonicalEncode logic can be simplified using standard library features; and database errors in PublishDRM and PublishCatalog must be handled to ensure reliable policy drift detection. Additionally, it is recommended that the CatalogSignerWorker perform an initial run at startup to prevent stale signatures in environments with frequent restarts.
Conflicts resolved across domain types, OAuth service, handler API surface, and server wiring. All conflicts were textual (both sides added orthogonal fields/code in adjacent regions); semantic intent of each side is preserved.

- domain/credential.go: kept both DRMHash/ConstraintCatalogHash (highflame-ai#59) and MissionID (highflame-ai#81) fields.
- domain/signal.go: kept both SignalTypePolicyDrift (highflame-ai#59) and SignalTypeIdentityExpired in const block and Valid() switch.
- internal/service/credential.go: kept both governance binding fields (DRMVersion/DRMHash/ConstraintCatalogVer/ConstraintCatalogHash) and MissionID/CredentialExpiresAt fields on IssueRequest; merged IssuedCredential row literal accordingly.
- internal/service/oauth.go: kept governanceSvc + backchannelSvc as separate fields. tokenExchange's resolveGovernance now uses jwx v4 delegatedBy var (Subject() returns (string, error) in v4) and threads MissionID. authorization_code adds IdentityPolicyID alongside governance hashes. Introspect passthrough uses jwt.Get[string] instead of v2 Get.
- internal/handler/routes.go: combined attestationPolicySvc + backchannelSvc + governanceSvc fields and NewAPI params. Admin route registration adds registerGovernanceRoutes after registerBackchannelAdminRoutes/registerExpiringSoonRoute.
- server.go: kept both backchannelSvc wiring (CIBA two-phase) and governanceSvc wiring; cleanupWorker now takes backchannelRepo per upstream signature; handler.NewAPI call updated; gofmt applied to the merged struct literal.
- Migration renumber: 014_governance_artifacts -> 023_governance_artifacts to avoid prefix collision with upstream's 014_attestation_policies.

Verified clean: GOEXPERIMENT=jsonv2 go build, go vet, gofmt -l, governance integration tests (3/3 pass), full integration suite (green, ~9s with the upstream test optimisations).
Five comments resolved in priority order:

(high, #3258137060) emitDriftSignals fan-out now runs on a detached goroutine parented on a new GovernanceService.svcCtx so a hash transition affecting many identities does not block the admin POST. Server.Shutdown calls GovernanceService.Stop() to cancel in-flight fan-outs (mirrors the BackchannelService pattern). The affected-identity scan is paginated via the new CredentialRepository.ListIdentitiesByGovernanceHashPage (keyset cursor on identity_id ASC, default page size 500) so a single drift event cannot OOM the worker on huge tenants.

(medium, #3258137067) canonicalJSON drops the hand-rolled recursive encoder. encoding/json.Marshal sorts string-keyed map keys, so the two-pass Marshal->Unmarshal-into-any->Marshal pattern produces the same canonical output via stdlib alone. Hashing tests still pass; the determinism contract is preserved.

(medium, #3258137081, #3258137088) PublishDRM and PublishCatalog now log the GetActive lookup error instead of swallowing it. The failure is still non-fatal (we don't want to fail the write because the drift lookup couldn't reach the DB) but is no longer silent.

(medium, #3258137107) CatalogSignerWorker performs an initial run at startup before entering the 24h tick loop so a server restarted more often than the re-sign interval still produces fresh signed_at rows.

Verified clean: GOEXPERIMENT=jsonv2 go build, go vet, gofmt -l, governance integration tests (3/3 pass), full integration suite green.
Resolved conflicts:

- internal/handler/routes.go: kept both registerGovernanceRoutes (from PR) and registerSigningCredentialRoutes (from upstream)
- server.go: kept both governanceSvc and signingCredSvc parameters in NewAPI call

This integrates the signing credential feature (highflame-ai#150) with the governance binding feature (highflame-ai#59).

Co-authored-by: Cursor <cursoragent@cursor.com>
Merge Conflicts Resolved

Successfully merged

Conflicts Resolved:
New Features Integrated from upstream/main:
The merge conflict resolution is complete and the branch is now up-to-date with main. cc: @rsharath
Closes #59.
Summary
- token_exchange (RFC 8693) now refuses delegations not permitted by the active DRM; authorization_code embeds the hashes for audit but does not gate (consent is not delegation).

Changes
Schema (migration 014)
- decision_rights_matrix table -- append-only via drm_block_mutation trigger; new versions = new rows.
- constraint_catalog_versions table -- multiple rows can share hash when the 24h re-sign produces an unchanged document (only signed_at/signature differ).
- issued_credentials.drm_hash and constraint_catalog_hash columns + partial indexes scoped to non-revoked, non-expired credentials for policy_drift fan-out.

Domain (domain/)

- DRMDocument, DRMAllowedDelegation, DecisionRightsMatrix.
- ErrDRMUnauthorized, ErrDRMInvalid sentinels (wrapped with %w so callers use errors.Is).
- ConstraintCatalogVersion with json.RawMessage document (opaque -- ZeroID hashes and signs, does not parse).
- SignalTypePolicyDrift added to the SignalType enum and Valid() switch.
- DRMHash and ConstraintCatalogHash fields on IssuedCredential.

Service (internal/service/governance.go)

- HashSHA256 -- deterministic canonical-JSON via recursive sorted-key encoding; identical documents always hash identically.
- AuthorizeDelegation -- exact match plus single trailing-* SPIFFE pattern (predictable failure modes; no third-party glob library).
- PublishDRM / PublishCatalog / ResignCatalog -- catalog ES256-signs sha256(hash || "|" || signed_at) via the existing jwksSvc.
- emitDriftSignals -- best-effort fan-out of policy_drift signals on hash transition.

Worker (internal/worker/catalog_signer.go)

- Re-signs each tenant's active catalog every 24h; preserves Hash, rewrites SignedAt/Signature. Decoupled from the service package via a CatalogResigner interface.

OAuth wiring (internal/service/oauth.go)

- tokenExchange runs resolveGovernance after scope-intersection and before IssueCredential; rejection wraps ErrDRMUnauthorized as invalid_grant.
- authorization_code embeds the four governance claims but skips the authorization gate.
- Four claim names (drm_version, drm_hash, constraint_catalog_version, constraint_catalog_hash) added to reservedClaims so deployer enrichers cannot spoof them; IssueCredential writes them after CustomClaims as defence-in-depth.
- Introspect passes the four claims through when present.

Admin API (internal/handler/governance.go)

- POST/GET/list /governance/decision-rights-matrix
- POST/GET /governance/constraint-catalog/active
- Routes self-suppress in registerGovernanceRoutes when governanceSvc is nil.

Tests (tests/integration/governance_test.go)

- Happy path with claim assertions, DRM-deny path (invalid_grant), and no-config backward-compat path.
- Each test uses fresh X-Account-ID/X-Project-ID headers because the DRM table is append-only (cannot t.Cleanup delete).

Test plan
- go build ./... clean
- go vet ./... clean
- gofmt -l clean

Acceptance criteria
- …domain/decision_rights_matrix.go)
- …PublishCatalog)
- drm_hash and constraint_catalog_hash claims added to delegation token issuance
- …tokenExchange)
- policy_drift CAE signal implemented (SignalTypePolicyDrift + emitDriftSignals)
- …internal/handler/governance.go; downstream Credential Policies guide can land separately)

Notes for reviewers
- authorization_code deliberately skips the DRM gate. The issue says "Token issuance MUST fail if the requested delegation is not authorized by the current DRM"; my reading is that human consent is not a delegation in the SPIFFE-pair sense (no from -> to URI pair exists), so the hash binding is the right answer there but the gate is not. Happy to revisit if you read it differently.
- …main; the issue references "Cedar policy evaluation in Shield" which I read as downstream. The Constraint Catalog is therefore an opaque blob ZeroID hashes and signs but does not parse, matching the "machine-readable, versioned policy document" wording.
- from/to SPIFFE pattern matching supports only exact match and a single trailing *. Full glob is over-engineering for an operator-authored DRM; predictable failure modes matter more than expressiveness.
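The matching rule in the last note (exact match or a single trailing *) is small enough to sketch in full. This is an illustration only: matchSPIFFE, allowedDelegation, its From/To fields, and authorized are hypothetical names, and treating a * anywhere other than the final position as a non-match is an assumption about how the PR fails closed.

```go
package main

import "strings"

// matchSPIFFE reports whether id matches pattern. Two forms are supported:
// exact equality, or a literal prefix followed by one trailing "*".
func matchSPIFFE(pattern, id string) bool {
	prefix, wildcard := strings.CutSuffix(pattern, "*")
	if strings.Contains(prefix, "*") {
		// Assumption: a "*" anywhere but the end is unsupported and never matches.
		return false
	}
	if wildcard {
		return strings.HasPrefix(id, prefix)
	}
	return pattern == id
}

// allowedDelegation is a hypothetical rule shape with from/to patterns.
type allowedDelegation struct {
	From string
	To   string
}

// authorized returns true when any rule matches both ends of the delegation.
func authorized(rules []allowedDelegation, from, to string) bool {
	for _, r := range rules {
		if matchSPIFFE(r.From, from) && matchSPIFFE(r.To, to) {
			return true
		}
	}
	return false
}
```

For example, a rule from spiffe://example.org/orchestrator to spiffe://example.org/agents/* would permit a delegation to spiffe://example.org/agents/billing and reject one to spiffe://example.org/tools/billing.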