feat(curtailment): admin RPC, override fields, and session-only auth #173
Adds the proto contract surface for the BE-1.1 follow-up to issue #171:
- AdminTransitionEvent RPC with AdminTransitionEventRequest/Response messages for the dead-reconciler operational runbook. target_state CEL is restricted to CANCELLED (=6) and FAILED (=7); COMPLETED states are intentionally excluded so the recovery RPC cannot misreport an event whose restore did not actually run.
- StartCurtailmentRequest gains optional candidate_min_power_w_override (per-org default override) and bool allow_unbounded (skip max_duration_default_sec normalization). Both are admin-only.
- PreviewCurtailmentPlanRequest gains optional candidate_min_power_w_override on the same field tag (26).
- StopCurtailmentRequest gains optional restore_batch_size_override that takes precedence over any prior UpdateCurtailmentEvent value for the duration of restore.

Generated Go and TypeScript outputs regenerated via just gen.

Refs: #171, #116, #118
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
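The target_state restriction described in this commit can be sketched in plain Go. This is a hand-written stand-in for the generated validator, not the project's code: only the numeric values 6 and 7 come from the commit message, while the type name, the other constants, and the value given to COMPLETED are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// EventState mirrors the numeric pins quoted in the commit message
// (CANCELLED = 6, FAILED = 7). All other names and values are hypothetical.
type EventState int32

const (
	EventStateUnspecified EventState = 0
	EventStateCompleted   EventState = 4 // assumed value, for illustration only
	EventStateCancelled   EventState = 6
	EventStateFailed      EventState = 7
)

var errTargetState = errors.New("target_state must be CANCELLED or FAILED")

// validateAdminTargetState accepts only the two failure-side terminal states,
// so a recovery call can never mark an event as COMPLETED.
func validateAdminTargetState(s EventState) error {
	switch s {
	case EventStateCancelled, EventStateFailed:
		return nil
	default:
		return errTargetState
	}
}

func main() {
	for _, s := range []EventState{EventStateCancelled, EventStateFailed, EventStateCompleted} {
		fmt.Printf("state=%d err=%v\n", s, validateAdminTargetState(s))
	}
}
```

Keeping the allow-list to two explicit values means any state added later is rejected by default until someone deliberately extends the list.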
Adds CategoryCurtailment to the EventCategory enum used by the activity log so curtailment domain code in subsequent tickets (BE-3 dispatch, BE-5 read APIs) can emit events under a dedicated category rather than overloading an existing one. Also lands a table-driven test covering all three Valid() switches in this file (EventCategory, ActorType, ResultType). The previous file had no Valid() coverage; pinning all three together prevents an "added a new enum value but forgot to extend Valid()" regression for future curtailment, schedule, or auth additions. Refs: #171 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
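The "added a new enum value but forgot to extend Valid()" regression mentioned above is easiest to see in a small sketch. Only CategoryCurtailment and the Valid() method name come from the commit message; the string values, the CategoryAuth constant, and the helper names below are assumptions.

```go
package main

import "fmt"

// EventCategory is a string-typed enum. CategoryAuth and both string values
// are placeholders; only CategoryCurtailment is named in the commit.
type EventCategory string

const (
	CategoryAuth        EventCategory = "auth"
	CategoryCurtailment EventCategory = "curtailment"
)

// Valid reports whether c is a known category. Every new constant must also
// be added to this switch, which is the omission the table below catches.
func (c EventCategory) Valid() bool {
	switch c {
	case CategoryAuth, CategoryCurtailment:
		return true
	default:
		return false
	}
}

type validCase struct {
	name string
	in   EventCategory
	want bool
}

// validCases is the table: one row per constant plus negative rows.
func validCases() []validCase {
	return []validCase{
		{"auth", CategoryAuth, true},
		{"curtailment", CategoryCurtailment, true},
		{"empty", EventCategory(""), false},
		{"unknown", EventCategory("bogus"), false},
	}
}

// failingCases returns the names of rows whose Valid() result differs from
// the expectation; an empty slice means the table passes.
func failingCases() []string {
	var failed []string
	for _, tc := range validCases() {
		if got := tc.in.Valid(); got != tc.want {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failing cases:", failingCases())
}
```

In a real test file the loop body would normally live inside t.Run subtests; the helper form here just keeps the sketch runnable as a standalone program.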
Lands the handler-layer enforcement for BE-1.1's admin recovery RPC and the three admin-gated optional override request fields, plus the matching test surface.

requireAdminFromContext rejects non-admin roles AND API-key auth before returning Unimplemented from the stub. The API-key rejection hardens the override path on PreviewCurtailmentPlan, which is otherwise API-key-accessible: a leaked admin-owner API key cannot drive override-bearing Preview calls. Start/Stop/Update/AdminTransition are already covered at the interceptor layer (next commit), so the handler-level API-key check is defense-in-depth for those four.

Test coverage:
- TestHandler_AdminTransitionEventRoleGate: admin/super-admin reach Unimplemented; viewer + empty role return PermissionDenied.
- TestHandler_AdminTransitionEventValidation: buf.validate constraints on event_uuid, target_state (CANCELLED/FAILED only), and reason.
- TestHandler_OverrideFieldsRoleGate: 11 cases covering the matrix of (override field) × (admin-via-API-key | admin-via-session | viewer). API-key with admin role is rejected to prevent escape-hatch escalation.
- TestHandler_NoOverrideSkipsRoleGate: Preview/Stop without overrides reach Unimplemented (preserves API-key-accessible reads).
- TestHandler_NonAdminRPCsReturnUnimplemented: renamed from TestHandler_AllRPCsReturnUnimplemented since AdminTransitionEvent's Unimplemented body is covered by the role-gate test instead.
- TestHandler_AdminTransitionEventRejectsMissingSession: covers the RPC invoked outside the authenticated request path.

Refs: #171
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
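The gate described in this commit (reject API-key auth, then reject non-admin roles) might look roughly like the sketch below. It is not the project's requireAdminFromContext: the Code, AuthMethod, and SessionInfo types and the role strings are assumptions, and a later commit in this PR removes the API-key check.

```go
package main

import "fmt"

// Code is a stand-in for an RPC status code type (hypothetical).
type Code string

const (
	CodeOK               Code = "ok"
	CodePermissionDenied Code = "permission_denied"
)

// AuthMethod records how the caller authenticated (hypothetical type).
type AuthMethod int

const (
	AuthSession AuthMethod = iota
	AuthAPIKey
)

// SessionInfo is an assumed shape for the caller identity in context.
type SessionInfo struct {
	Role   string
	Method AuthMethod
}

// isAdminRole treats "admin" and "super-admin" as privileged; the exact role
// strings are assumptions.
func isAdminRole(role string) bool {
	return role == "admin" || role == "super-admin"
}

// requireAdmin rejects API-key callers first, then non-admin roles, matching
// the order described in this commit message.
func requireAdmin(info SessionInfo) Code {
	if info.Method == AuthAPIKey {
		return CodePermissionDenied
	}
	if !isAdminRole(info.Role) {
		return CodePermissionDenied
	}
	return CodeOK
}

func main() {
	fmt.Println(requireAdmin(SessionInfo{Role: "admin", Method: AuthSession}))
	fmt.Println(requireAdmin(SessionInfo{Role: "admin", Method: AuthAPIKey}))
	fmt.Println(requireAdmin(SessionInfo{Role: "viewer", Method: AuthSession}))
}
```

Only when the gate returns OK would the handler fall through to its Unimplemented stub body.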
Adds StartCurtailment, StopCurtailment, UpdateCurtailmentEvent, and AdminTransitionEvent to SessionOnlyProcedures so a leaked API key cannot mass-stop a fleet, abort a live curtailment event, or force-recover a non-terminal event. The three read RPCs (PreviewCurtailmentPlan, GetActiveCurtailment, ListCurtailmentEvents) remain API-key-accessible for monitoring and dashboards.

Three tests pin the contract:
- TestCurtailmentWriteProceduresAreSessionOnly: list-membership assertion that the four write/admin RPCs are present.
- TestCurtailmentReadProceduresStayApiKeyAccessible: list-membership assertion that the three read RPCs are NOT present.
- TestAuthInterceptor_SessionOnlyRejectsApiKeyAuth: runtime test driving authenticate() with a Bearer header against each session-only curtailment procedure and asserting PermissionDenied.

Closes the issue #171 acceptance criterion that called for runtime enforcement coverage rather than list-only assertions.

Refs: #171
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
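The membership-plus-runtime contract in this commit can be illustrated with a small lookup. The name SessionOnlyProcedures and the RPC names come from the commit; the map representation, the service path prefix, and the helper functions are assumptions, and the set shown reflects this commit only (a later commit narrows it).

```go
package main

import "fmt"

// servicePrefix is a hypothetical fully-qualified service path; the real
// procedure strings may differ.
const servicePrefix = "/curtailment.v1.CurtailmentService/"

// sessionOnlyProcedures lists the procedures that must not be reachable with
// an API key, as of this commit.
var sessionOnlyProcedures = map[string]struct{}{
	servicePrefix + "StartCurtailment":       {},
	servicePrefix + "StopCurtailment":        {},
	servicePrefix + "UpdateCurtailmentEvent": {},
	servicePrefix + "AdminTransitionEvent":   {},
}

// isSessionOnly reports whether a procedure is registered as session-only.
func isSessionOnly(procedure string) bool {
	_, ok := sessionOnlyProcedures[procedure]
	return ok
}

// allowRequest models the interceptor decision: API-key callers are refused
// on session-only procedures, and everything else passes this check.
func allowRequest(procedure string, viaAPIKey bool) bool {
	return !(viaAPIKey && isSessionOnly(procedure))
}

func main() {
	for _, name := range []string{"StartCurtailment", "PreviewCurtailmentPlan"} {
		p := servicePrefix + name
		fmt.Printf("%s apiKeyAllowed=%v\n", name, allowRequest(p, true))
	}
}
```

A list-membership test checks isSessionOnly directly, while a runtime test exercises the equivalent of allowRequest with a Bearer header, which is the distinction the commit draws.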
Pull request overview
Adds the remaining contract/auth hardening for curtailment v1 ahead of persistence/dispatcher work: a new admin recovery RPC, admin-gated override request fields, and session-only enforcement for curtailment write/admin procedures, plus an activity taxonomy update to support curtailment-domain events.
Changes:
- Added `AdminTransitionEvent` RPC to the curtailment proto (with validation restricting `target_state` to `CANCELLED`/`FAILED`) and updated generated Go/TypeScript outputs.
- Enforced session-only auth for curtailment write/admin RPCs in the auth interceptor config, with tests pinning both membership and runtime behavior.
- Added handler-level admin gating for override-bearing requests and for `AdminTransitionEvent`, plus tests for role/auth-method matrices and request validation.
Reviewed changes
Copilot reviewed 7 out of 10 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| server/internal/handlers/interceptors/config.go | Adds curtailment write/admin procedures to SessionOnlyProcedures. |
| server/internal/handlers/interceptors/config_test.go | Tests session-only list membership, read-RPC non-membership, and runtime API-key rejection. |
| server/internal/handlers/curtailment/handler.go | Adds AdminTransitionEvent stub + central requireAdminFromContext gate for admin-only paths/override fields. |
| server/internal/handlers/curtailment/handler_test.go | Adds targeted tests for admin role gating, override-field gating, and AdminTransitionEvent validation. |
| server/internal/domain/activity/models/models.go | Introduces CategoryCurtailment and marks it valid. |
| server/internal/domain/activity/models/models_test.go | Table-driven tests covering Valid() for EventCategory, ActorType, and ResultType. |
| proto/curtailment/v1/curtailment.proto | Adds new RPC + override fields and validation constraints. |
| server/generated/grpc/curtailment/v1/curtailmentv1connect/curtailment.connect.go | Regenerated Connect stubs to include the new RPC (generated). |
| client/src/protoFleet/api/generated/curtailment/v1/curtailment_pb.ts | Regenerated TS protobuf outputs for new fields/RPC (generated). |
…covery

Drops StartCurtailment, StopCurtailment, and UpdateCurtailmentEvent from SessionOnlyProcedures so external integrations can drive curtailment via the public API. AdminTransitionEvent stays session-only: the operator-of-last-resort recovery RPC must not be reachable via a long-lived bearer token.

The handler-level role gate (requireAdminFromContext) on the override paths is preserved but no longer rejects API-key auth specifically. Override fields still require admin / super-admin role; the API key model is per-user with role inheritance, so an admin-role API key can drive override-bearing requests. Adding extra friction on the API-key auth method while plain Start / Stop calls remain API-key-accessible buys no real defense, because a leaked admin key has full curtailment-write blast radius either way.

Test changes:
- TestCurtailmentWriteProceduresAreSessionOnly → TestCurtailmentAdminProcedureIsSessionOnly: only AdminTransitionEvent.
- TestCurtailmentReadProceduresStayApiKeyAccessible → TestCurtailmentNonAdminProceduresStayApiKeyAccessible: now also asserts Start / Stop / Update are NOT in SessionOnlyProcedures.
- TestAuthInterceptor_SessionOnlyRejectsApiKeyAuth → TestAuthInterceptor_AdminTransitionEventRejectsApiKeyAuth: scoped to the single remaining session-only curtailment procedure.
- TestHandler_OverrideFieldsRoleGate: API-key cases inverted. Admin via API key now reaches Unimplemented; viewer is rejected regardless of auth method. New "viewer + API key" case added for parity.

Refs: #171
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔐 Codex Security Review
Review Summary
Overall Risk: MEDIUM
Findings
[MEDIUM]
…ethod matrix

Two small simplifications surfaced by the post-API-access review:
- Trim requireAdminFromContext doc to a one-line behavioral summary; the why-no-auth-method-check rationale lives in the prior commit message.
- Add three missing TestHandler_OverrideFieldsRoleGate cases so viewer × API-key is exercised on Start (candidate + allow_unbounded) and Stop, matching the existing Preview coverage. Pins the contract that the override gate rejects viewer regardless of auth method.

Refs: #171
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
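The symmetric matrix this commit rounds out (four override paths, two auth methods, two roles, as listed in the PR description's test plan) can be generated rather than hand-written. The sketch below is illustrative only: the case struct, the string labels, and the Code type are assumptions, and the real test may enumerate its rows explicitly.

```go
package main

import "fmt"

// Code is a stand-in for an RPC status code type (hypothetical).
type Code string

const (
	CodePermissionDenied Code = "permission_denied"
	CodeUnimplemented    Code = "unimplemented"
)

// matrixCase is one row of the override-gate matrix.
type matrixCase struct {
	path string
	auth string
	role string
	want Code
}

// expectedCode keys the outcome on role alone: admin falls through to the
// stub body, viewer is rejected. The auth method is deliberately ignored.
func expectedCode(role string) Code {
	if role == "admin" {
		return CodeUnimplemented
	}
	return CodePermissionDenied
}

// buildMatrix produces the full cross product of path, auth method, and role.
func buildMatrix() []matrixCase {
	paths := []string{"Preview", "Start-candidate", "Start-allow_unbounded", "Stop"}
	auths := []string{"session", "api-key"}
	roles := []string{"viewer", "admin"}

	var cases []matrixCase
	for _, p := range paths {
		for _, a := range auths {
			for _, r := range roles {
				cases = append(cases, matrixCase{path: p, auth: a, role: r, want: expectedCode(r)})
			}
		}
	}
	return cases
}

func main() {
	for _, c := range buildMatrix() {
		fmt.Printf("%-22s %-8s %-7s -> %s\n", c.path, c.auth, c.role, c.want)
	}
}
```

Generating the cross product makes an asymmetric gap, such as a missing viewer via API-key row, impossible by construction.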
…vent Reviewer nit on #173: the RPC's target_state CEL is restricted to CANCELLED and FAILED, so the operational behavior is "force-end this event" rather than "transition through a state graph". Rename the RPC, its Request/Response messages, the action-verb error string, and the matching test function names so the public surface describes behavior rather than mechanism. Mechanical rename only; no logic change. Generated Go and TypeScript outputs regenerated via just gen. Refs: #171 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add idempotency_key (= 4) to AdminTerminateEventRequest, mirroring StartCurtailmentRequest semantics so recovery RPC retries collapse.
- Add buf.validate bounds on the admin override fields: candidate_min_power_w_override (gte: 1, lte: 10000000) and restore_batch_size_override (gte: 1, lte: 10000); zero-as-override is excluded since the existing convention is "zero means use default".
- Cross-reference the paired AdminTerminateEvent role gates so neither side (SessionOnlyProcedures entry, handler-side requireAdminFromContext) is removed independently.
- Remap missing session.Info from CodeInternal to CodeUnauthenticated in requireAdminFromContext so the response code reflects "no identity" rather than "server bug"; tests updated to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
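Two of the changes in this commit, the override bounds and the missing-session remap, are sketched below in plain Go. The numeric limits come from the commit message; everything else (the Code type, the pointer-for-optional convention, the SessionInfo shape, the role strings, and the function names) is an assumption and not the generated validator or the real handler.

```go
package main

import "fmt"

// Code is a stand-in for an RPC status code type (hypothetical).
type Code string

const (
	CodeOK               Code = "ok"
	CodeInvalidArgument  Code = "invalid_argument"
	CodeUnauthenticated  Code = "unauthenticated"
	CodePermissionDenied Code = "permission_denied"
)

// Upper bounds quoted in the commit message.
const (
	maxCandidateMinPowerW uint32 = 10_000_000
	maxRestoreBatchSize   uint32 = 10_000
)

// u32 returns a pointer to v, modelling a set optional field.
func u32(v uint32) *uint32 { return &v }

// checkOverride treats nil as "not set, use the default". A set value must
// fall in [1, limit], so zero can never be sent as an explicit override.
func checkOverride(v *uint32, limit uint32) Code {
	if v == nil {
		return CodeOK
	}
	if *v < 1 || *v > limit {
		return CodeInvalidArgument
	}
	return CodeOK
}

// SessionInfo is an assumed shape for the caller identity.
type SessionInfo struct{ Role string }

// requireAdmin maps a missing session to Unauthenticated ("no identity")
// and a non-admin role to PermissionDenied.
func requireAdmin(info *SessionInfo) Code {
	if info == nil {
		return CodeUnauthenticated
	}
	if info.Role != "admin" && info.Role != "super-admin" {
		return CodePermissionDenied
	}
	return CodeOK
}

func main() {
	fmt.Println(checkOverride(nil, maxRestoreBatchSize))
	fmt.Println(checkOverride(u32(0), maxRestoreBatchSize))
	fmt.Println(checkOverride(u32(500), maxRestoreBatchSize))
	fmt.Println(requireAdmin(nil))
}
```

Separating "no identity" from "wrong role" lets callers and alerting tell an authentication wiring problem apart from an authorization denial.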
…-2 foundation)

Lays the foundation for the BE-2 ticket (#140): the curtailment persistence layer, sqlc-backed store, domain models, the FIXED_KW mode implementation, and the enum-stability guard for the AdminTerminateEvent validator pinned in BE-1.x (#173).

Migration 000040:
- curtailment_event with full lifecycle columns plus the BE-1.x admin-only fields (allow_unbounded BOOLEAN, effective_batch_size INT). CHECK constraints enforce maintenance-pair consistency, non-empty external source/reference/idempotency_key, and non-empty reason. Partial UNIQUE indexes cover idempotency, webhook dedupe, and active-event lookup.
- curtailment_target with composite PK (event_id, device_identifier), partial indexes for pending work and active-by-device schedule lookup.
- curtailment_reconciler_heartbeat singleton seeded at migration time so the staleness alert always has a row to read.
- curtailment_org_config with per-org tunables seeded one row per existing org in the same migration transaction; down migration drops the table.

Domain layer:
- server/internal/domain/curtailment/models defines the boundary shapes (Event, Target, OrgConfig, Heartbeat, EventState/TargetState typed wrappers) so selector/handler/modes do not import sqlc-generated code.
- server/internal/domain/curtailment/modes ships the Mode interface and the FixedKw implementation. Pure logic: no I/O, no time, no shared state. Covers the three design-doc outcomes: target reached (overshoot bounded by last-added candidate), undershoot tolerated (only with explicit positive tolerance_kw), and insufficient curtailable load (with a structured InsufficientLoadDetail the handler can echo back).

Store layer:
- interfaces/curtailment.go defines the org-scoped CurtailmentStore; v1 surface is the minimum needed to support Preview plus the basic event/target CRUD primitives so store tests can verify the schema constraints round-trip.
- sqlstores/curtailment.go implements the interface using the sqlc-generated queries (GetCurtailmentOrgConfig, ListActiveCurtailedDevicesByOrg, ListRecentlyResolvedCurtailedDevicesByOrg, InsertCurtailmentEvent, GetCurtailmentEventByUUID, InsertCurtailmentTarget, ListCurtailmentTargetsByEvent, GetCurtailmentReconcilerHeartbeat).

BE-1.x guard:
- TestCurtailmentEventStateNumericPins asserts CANCELLED == 6 and FAILED == 7 at build time. The AdminTerminateEventRequest validator pins on (buf.validate.field).enum.in: [6, 7]; this test fails CI before any future enum reorder can silently desynchronize the validator.

Selector + handler implementation lands in follow-up commits on this branch.

Refs #140
Refs #118
Refs #173
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
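The enum-stability guard described above can be approximated in two ways, shown in the sketch below: a runtime comparison against the validator's allow-list, and a compile-time index check in the style of the guard that the stringer tool emits. Neither is taken from the repository; the type name, constant names, and helper functions are assumptions, and the compile-time idiom is a swapped-in technique rather than what TestCurtailmentEventStateNumericPins necessarily does.

```go
package main

import "fmt"

// EventState and its constants are hypothetical stand-ins for the generated
// enum; only the values 6 and 7 come from the commit message.
type EventState int32

const (
	EventStateCancelled EventState = 6
	EventStateFailed    EventState = 7
)

// Compile-time guard: if either constant changes, the index is no longer 0
// and the compiler reports an invalid array index.
func _() {
	var pin [1]struct{}
	_ = pin[EventStateCancelled-6]
	_ = pin[EventStateFailed-7]
}

// validatorAllowList mirrors the numbers hard-coded in the validator
// annotation, enum.in: [6, 7].
func validatorAllowList() []int32 {
	return []int32{6, 7}
}

// pinsMatch is the runtime form of the guard: it fails if the enum values and
// the hard-coded validator numbers drift apart.
func pinsMatch() bool {
	allowed := validatorAllowList()
	return len(allowed) == 2 &&
		allowed[0] == int32(EventStateCancelled) &&
		allowed[1] == int32(EventStateFailed)
}

func main() {
	fmt.Println("pins match:", pinsMatch())
}
```

The guard matters because the validator stores raw numbers: renumbering the enum would leave the validator accepting whatever states now occupy 6 and 7.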
Background
Curtailment is the proto-fleet feature that reduces a fleet's mining power on demand — operators can preview a plan, start a curtailment event for a selected scope of miners, verify the reduction via telemetry, and restore the miners safely afterwards. Use cases include grid-program participation, demand response, and operator-initiated power reduction.
The curtailment foundation (#116, merged in #118) shipped the v1 proto contract: six RPCs (`PreviewCurtailmentPlan`, `StartCurtailment`, `UpdateCurtailmentEvent`, `StopCurtailment`, `GetActiveCurtailment`, `ListCurtailmentEvents`), the full enum surface (modes, strategies, levels, priorities, event/target states), capability flags, and Connect handler stubs that all return `Unimplemented`. Persistence, the selector, the reconciler, the restorer, and the read-API logic land in subsequent tickets that fill in those stubs.

This PR is a contract-layer follow-up to that foundation: a small set of additions to the same proto / handler / interceptor surface that need to be in place before the downstream tickets wire up bodies. Every new RPC stub still returns `Unimplemented`; the value here is contract shape and auth scoping.

Summary
Four scoped changes:
- `AdminTerminateEvent` RPC for the dead-reconciler operational runbook. Forces a non-terminal event to a terminal state. `target_state` CEL is restricted to `CANCELLED` (=6) and `FAILED` (=7); `COMPLETED` states are intentionally rejected so the recovery RPC cannot misreport an event whose restore did not actually run. Carries an `idempotency_key` field so operator double-clicks during the runbook collapse to a single recovery action.
- Admin-gated override fields with `buf.validate` sanity ceilings:
  - `optional uint32 candidate_min_power_w_override` on `PreviewCurtailmentPlanRequest` and `StartCurtailmentRequest`: per-org default override for the candidate-eligibility floor; bounded to `[1, 10_000_000]` (10 MW per miner).
  - `bool allow_unbounded` on `StartCurtailmentRequest`: explicit acknowledgement to skip `max_duration_default_sec` normalization.
  - `optional uint32 restore_batch_size_override` on `StopCurtailmentRequest`: takes precedence over any prior `UpdateCurtailmentEvent` value for the duration of restore; bounded to `[1, 10_000]`.

  The "zero means use default" convention used elsewhere in the proto is preserved by the lower bound on each override.
- Session-only auth for `AdminTerminateEvent` only. The recovery escape hatch must not be reachable via a long-lived bearer token. `StartCurtailment` / `StopCurtailment` / `UpdateCurtailmentEvent` and the read RPCs remain API-key-accessible so external integrations can drive curtailment via the public API. The session-only listing and the handler-side admin gate are now cross-referenced in code comments as a paired invariant; neither alone is sufficient.
- `CategoryCurtailment` added to the activity event-category taxonomy so subsequent tickets can emit curtailment-domain events under a dedicated category.

Also: when the handler-side admin gate sees no `session.Info` in context, it now returns `Unauthenticated` instead of propagating `Internal` from `session.GetInfo`. The auth interceptor should prevent this in production, but if interceptor wiring ever regresses, the response code reflects "no identity" rather than "server bug", which means quieter alerts and no retry-storm encouragement.

Security posture
- `PreviewCurtailmentPlan`
- `StartCurtailment`
- `StopCurtailment`
- `UpdateCurtailmentEvent`
- `GetActiveCurtailment`
- `ListCurtailmentEvents`
- `AdminTerminateEvent`

The handler-level admin gate (`requireAdminFromContext`) fires on `AdminTerminateEvent` always, and on Preview / Start / Stop when the corresponding override field is set on the request. The current API-key model is per-user with role inheritance, so an admin-role API key can drive override-bearing requests, matching how the apikey handler itself gates admin operations on role alone. A leaked admin API key already has full curtailment-write blast radius via plain `StartCurtailment`; restricting the override paths further would not close that gap.

`AdminTerminateEvent`'s session-only registration narrows the blast radius for the operator-of-last-resort recovery escape hatch specifically. A future scoped-API-key primitive would be the right place to relax it.

What changed
Eight logical commits:
- `feat(curtailment): add admin RPC and admin-gated override fields`: proto contract additions, regenerated Go + TypeScript outputs.
- `feat(activity): add curtailment event category`: `CategoryCurtailment` enum entry plus a table-driven test covering all three `Valid()` switches in `activity/models.go` (`EventCategory`, `ActorType`, `ResultType`).
- `feat(curtailment): gate admin and override paths at the handler`: single `requireAdminFromContext` helper called from `AdminTerminateEvent` (always) and from Preview / Start / Stop when the corresponding override field is set. Action-verb error messages are routed through two package-level constants (`actionSupplyOverrideFields`, `actionTerminateEvents`).
- `feat(interceptors): register curtailment write RPCs as session-only`: initial registration of all four write/admin procedures.
- `feat(curtailment): allow API key access to write RPCs except admin recovery`: narrows the previous commit so only `AdminTerminateEvent` remains session-only. Test surface updated to match: invert the role-gate matrix to assert admin-via-API-key reaches `Unimplemented` on override paths, expand the "API-key-accessible" assertion list, scope the runtime SessionOnly interceptor test to the single remaining session-only procedure.
- `refactor(curtailment): trim handler doc and round out viewer × auth-method matrix`: small post-review polish. Trims a 4-line doc comment to one line and adds three missing viewer-via-API-key cases on Start/Stop so the override-gate test matrix is symmetric across auth methods.
- `refactor(curtailment): rename AdminTransitionEvent to AdminTerminateEvent`: addresses the inline review nit. The RPC's `target_state` CEL is restricted to terminal states only, so `Terminate` describes operational behavior more accurately than `Transition` (which describes the underlying state-machine mechanism). Mechanical rename across the proto, generated outputs, handler, tests, and SessionOnly registration.
- `refactor(curtailment): apply review feedback`: review-driven hardening pass. Adds `idempotency_key` (= 4) to `AdminTerminateEventRequest`; adds `buf.validate` `{gte: 1, lte: ...}` sanity ceilings on the two override `uint32` fields; remaps the handler-side admin gate's missing-session case from `Internal` to `Unauthenticated` (tests updated accordingly); cross-references the paired AdminTerminateEvent role gates in code comments at both sites so neither layer is removed independently.

What is intentionally not in this PR
- Handler bodies for `PreviewCurtailmentPlan`, `StartCurtailment`, `UpdateCurtailmentEvent`, `StopCurtailment`, `GetActiveCurtailment`, `ListCurtailmentEvents`: those land in subsequent persistence, dispatch, and read-API tickets.
- `AdminTerminateEvent` business logic: handler stub returns `Unimplemented` after the role gate. The transition logic and `curtailment_admin_terminate` activity emission land with the read-API ticket.
- `candidate_min_power_w_override`: the proto field exists; the persistence/preview ticket consumes it.
- `restore_batch_size_override` overrides: lands with the restore ticket.
- `max_duration_default_sec` normalization driven by `allow_unbounded`: lands with the start/dispatch ticket.
- Letting `AdminTerminateEvent` accept API-key callers with an explicit `curtailment.admin` scope: separate larger feature.
- `domain/auth` helper: separate small refactor PR.

Test plan
- `buf lint` clean
- `npm run lint` (eslint, max-warnings 0) clean
- `golangci-lint run` clean across all three lint targets (`server`, `plugin/proto`, `plugin/antminer`)
- `go test ./internal/handlers/curtailment/...`
- `go test ./internal/handlers/interceptors/...`
- `go test ./internal/domain/activity/models/...`
- `just gen` regenerates Go + TypeScript outputs cleanly with no orphan diff (proto + generated outputs are in the same commit per `AGENTS.md` rule 2).
- `TestHandler_OverrideFieldsRoleGate` covers 16 cases, the full `(Preview / Start-candidate / Start-allow_unbounded / Stop) × (session / API-key) × (viewer / admin)` matrix: viewer is rejected on every override path regardless of auth method; admin reaches `Unimplemented` on every override path regardless of auth method. Verifies the override gate is keyed on role, not on auth method.
- `TestHandler_NoOverrideSkipsRoleGate` confirms Preview / Stop without overrides reach `Unimplemented` regardless of session info.
- `TestHandler_AdminTerminateEventRoleGate` covers admin / super-admin / viewer / empty-role paths against `AdminTerminateEvent` directly.
- `TestHandler_AdminTerminateEventValidation` covers the buf.validate constraints: `target_state` rejects all six non-CANCELLED/FAILED values (`UNSPECIFIED`, `PENDING`, `ACTIVE`, `RESTORING`, `COMPLETED`, `COMPLETED_WITH_FAILURES`); `event_uuid` and `reason` `min_len=1` enforced. Validator-passed cases now surface `CodeUnauthenticated` from the handler-side admin gate (no session in test context), matching the remapped error code.
- `TestHandler_AdminTerminateEventRejectsMissingSession` asserts the handler-side admin gate returns `CodeUnauthenticated` when no `session.Info` is in context.
- `TestAuthInterceptor_AdminTerminateEventRejectsApiKeyAuth` exercises the runtime `authenticate()` path with a Bearer header against `AdminTerminateEvent` and asserts `PermissionDenied`.
- `TestCurtailmentAdminProcedureIsSessionOnly` and `TestCurtailmentNonAdminProceduresStayApiKeyAccessible` pin the registration list in both directions.

Closes #171
Refs #116
Refs #118
🤖 Generated with Claude Code