feat(rfc-0001): gateway proto + agent schema + stub handler [phase 1]#170
Conversation
🔐 Codex Security Review
Review SummaryOverall Risk: MEDIUM Findings[MEDIUM]
|
Addresses security review feedback on PR #170 and the failing generated-code-check CI job. Schema (migration 000039): - agent and device gain composite UNIQUE (id, org_id) so other tables can build (id, org_id) FKs against them. - agent_device, device.agent_id, discovered_device.agent_id all use composite FKs against agent(id, org_id), so the DB rejects any binding that crosses tenants. - queue_message gains a denormalized org_id BIGINT NULL with a CHECK that agent_id requires org_id, plus a composite FK to agent(id, org_id). - command_batch_log reuses its existing organization_id (000036) for the composite FK and gains the same agent_id-requires-org CHECK. Logging (server/internal/handlers/interceptors): - agentgateway Register / BeginAuthHandshake / CompleteAuthHandshake / UploadHeartbeat are added to RedactedRequestProcedures. - CompleteAuthHandshake response is added to RedactedResponseProcedures. - ControlStream, UploadTelemetry, UploadEvents are added to SensitiveBodyProcedures so unary logs suppress the body entirely. - request_logging.go: extend loggingStreamingHandlerConn to honor SensitiveBodyProcedures, suppressing per-message bodies on Receive/Send for sensitive streaming RPCs. Authentication (server/internal/handlers/interceptors/config.go): - All seven agent-gateway RPCs added to UnauthenticatedProcedures so the handler is reachable as advertised (returns CodeUnimplemented rather than CodeUnauthenticated). Comment explicitly notes Phase 2 will replace this with a dedicated agent-auth interceptor. Generated code: - goimports applied to agentgateway pb.go and connect.go (the diff CI's generated-code-check was flagging). - sqlc regenerated; AgentDevice model gains OrgID, QueueMessage gains OrgID (sql.NullInt64). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ment_token Addresses additional security review feedback on PR #170. [MEDIUM] Upload RPCs were trusting a caller-supplied agent_id rather than the handshake-derived session token. The proto now drops agent_id from UploadTelemetryRequest, UploadEventsRequest, UploadHeartbeatRequest, and ControlHello. The new file-level comment block specifies the auth contract: every post-handshake RPC carries the session_token in Authorization metadata, and server-side handlers derive the effective agent identity (and tenant) from the validated token. Spoofing agent_id is no longer expressible on the wire. [MEDIUM] Register had no tenant-binding input despite the org-scoped agent schema. RegisterRequest now carries enrollment_token (operator- issued, single-use, org-scoped) as field 1; existing fields shift down by one. The field is the org binding for the agent before the handshake that issues an api_key. Note: these are field-number changes to a Phase 1 stub with no real clients yet. The wire shape is intentionally locked in before Phase 2 implementation begins so consumer code never has to migrate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Phase 1 scaffold for RFC-0001’s agent↔server split: introduces the AgentGateway protobuf surface, wires it into the server mux/reflect, adds initial DB schema for agents and device ownership, and tightens request/stream logging redaction for the new RPCs.
Changes:
- Added
AgentGatewayServiceproto definition and regenerated Go/TS client/server stubs. - Registered a stub (unimplemented) AgentGateway handler on the server, and added reflection + logging/redaction configuration for the new procedures.
- Added migration creating
agent/agent_devicetables (and required composite uniqueness ondevice(id, org_id)).
Reviewed changes
Copilot reviewed 8 out of 13 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| server/sdk/v1/python/proto_fleet_sdk/generated/pb/driver_pb2_grpc.py | Regenerated Python grpc stub metadata version constant. |
| server/migrations/000039_create_agent_tables.up.sql | Adds agent/agent_device tables + device composite uniqueness needed for tenant-aligned FKs. |
| server/migrations/000039_create_agent_tables.down.sql | Rollback for the new agent tables and device constraint. |
| server/internal/handlers/interceptors/request_logging.go | Suppresses per-message streaming bodies for procedures marked sensitive. |
| server/internal/handlers/interceptors/config.go | Adds agentgateway procedures to redaction/sensitivity configuration and unauth allowlist. |
| server/internal/handlers/agentgateway/handler.go | New stub handler embedding Connect’s unimplemented service. |
| server/generated/sqlc/models.go | Regenerated sqlc models for new agent tables. |
| server/generated/grpc/agentgateway/v1/agentgatewayv1connect/agentgateway.connect.go | Regenerated Connect handlers/clients for AgentGatewayService. |
| server/generated/grpc/agentgateway/v1/agentgateway.pb.go | Regenerated Go protobuf types for agentgateway. |
| server/cmd/fleetd/main.go | Registers AgentGateway handler on mux and adds it to reflection. |
| server/cmd/fleetd/config.go | Adds --mode enum flag to configuration (server/agent/combined). |
| proto/agentgateway/v1/agentgateway.proto | New agentgateway service and message schema (handshake + telemetry/events + control stream). |
| client/src/protoFleet/api/generated/agentgateway/v1/agentgateway_pb.ts | Regenerated TS protobuf types for agentgateway. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3869df410e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: aa0a96a514
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Lands Phase 1 of RFC 0001 (the agent + cloud-server split): the wire protocol scaffold, agent identity schema, and a stub handler that returns CodeUnimplemented for every RPC. No behavior change in any deployment shape (combined / server / agent). Wire protocol (proto/agentgateway/v1/agentgateway.proto): Register, BeginAuthHandshake, CompleteAuthHandshake, UploadTelemetry, UploadEvents, UploadHeartbeat, ControlStream. Post-handshake RPCs derive agent identity from a session_token in Authorization metadata rather than any body field. RegisterRequest carries an operator-issued, org-scoped enrollment_token. buf.validate rules pin ed25519 key/signature lengths, bound api_key/session_token/name, cap opaque payloads at 1 MiB, require timestamps, and require a populated ControlStream oneof variant. Handler (server/internal/handlers/agentgateway): Embeds UnimplementedAgentGatewayServiceHandler. Registered on the shared mux and added to grpcreflect. All seven RPCs are in UnauthenticatedProcedures because the user-session AuthInterceptor cannot validate the agent's session_token; the handler is responsible for credential validation when implemented. Logging redaction (server/internal/handlers/interceptors): Handshake request/response procedures added to redaction lists; ControlStream, UploadTelemetry, UploadEvents added to SensitiveBodyProcedures. The streaming logger now suppresses per-message bodies for sensitive procedures. Schema (server/migrations/000039_create_agent_tables): agent_device is the single source of truth for device-to-agent ownership; combined mode is the absence of a row. agent_device carries (agent_id, device_id, org_id) with composite FKs to both agent(id, org_id) and device(id, org_id), so cross-tenant pairings are rejected by the DB. agent identity_pubkey and (org_id, name) uniqueness are partial indexes scoped to deleted_at IS NULL, so a soft-deleted agent does not block re-enrollment. created_at and updated_at are NOT NULL. --mode flag (server/cmd/fleetd/config.go): Accepted via kong's enum validator (server/agent/combined, default combined). Not yet load-bearing. Closes #157 Refs #145 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
36d8a08 to
24ec5c0
Compare
Summary
Lands Phase 1 of RFC 0001 — the wire-protocol scaffold and schema for the agent + cloud-server split. No behavior change in any deployment shape (combined / server / agent).
proto/agentgateway/v1/agentgateway.protodefinesAgentGatewayServicewithRegister,BeginAuthHandshake,CompleteAuthHandshake,UploadTelemetry,UploadEvents,UploadHeartbeat, and a bidiControlStream. Post-handshake RPCs derive agent identity from a session token inAuthorizationmetadata, not body fields.server/internal/handlers/agentgatewayembeds the generatedUnimplementedAgentGatewayServiceHandler, so every RPC returnsCodeUnimplemented. The service is registered on the mux and added to grpcreflect alongside the pairing and telemetry services. All seven RPCs are added toUnauthenticatedProceduresso the user-session interceptor doesn't reject them; Phase 2 will replace this with a dedicated agent-auth interceptor.000039_create_agent_tablesaddsagentandagent_device.agent_deviceis the single source of truth for device→agent ownership, carrying(agent_id, device_id, org_id)with composite FKs againstagent(id, org_id)anddevice(id, org_id)so the DB enforces tenant alignment. Combined mode is the natural absence of a row inagent_device; the four leaf tables that Phase 2 will route through stay at their current shapes.--modeflag accepted inserver/cmd/fleetd/config.govia kong'senumvalidator (server/agent/combined, defaultcombined). Not yet load-bearing.RedactedRequestProcedures/RedactedResponseProcedures;ControlStream,UploadTelemetry,UploadEventsadded toSensitiveBodyProceduresand the streaming logger now suppresses per-message bodies for sensitive procedures.What is intentionally not in this PR
Unimplementeduntil Phase 2.--mode. Combined-mode behavior is unchanged.agentoragent_device. Phase 2 will add them when the agent binary actually starts using them.Test plan
go build ./...clean fromserver/go test ./... -skip ./generated/... -count=1 -shortpasses against the local timescaledbmigrate up. Verifiedagentandagent_deviceexist anddevicecarriesuq_device_id_org_id; the four leaf tables show no new columns or constraints.--modeflag appears infleetd --helpwith defaultcombined; invalid values rejected with--mode must be one of "server","agent","combined"golangci-lint runandbuf lintcleanjust devcanary to confirm virtual miner telemetry still flows. Requires Docker plugin build; not exercised on this branch.Closes #157
Refs #145
🤖 Generated with Claude Code