fix(deploy/observability): parse boundary_request JSON in firewall dashboard by ausbru87 · Pull Request #43 · coder/usgov-coderdemo

ausbru87 · 2026-06-09T14:21:26Z

Summary

The Agent Firewall (Coder Boundary) Grafana dashboard showed "No data" on its egress audit panels even though coderd was actively emitting boundary_request events for the austenplatform/firewall-test workspace.

Root cause: the five Loki audit panels parsed the log line with | logfmt, but coderd emits boundary_request as JSON with the audit fields nested under fields (for example fields.decision, fields.http_url). logfmt extracted nothing, so the decision, owner, template_id, and template_version_id label filters never matched and every panel returned empty. The domain/path regexp also assumed the logfmt http_url= form, which does not exist in the JSON line.
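The mismatch can be reproduced outside Loki. The following is a minimal sketch, not Loki's implementation: `parse_logfmt` is a rough stand-in for a logfmt reader, `extract_json` is a hypothetical helper that mimics explicit field extraction from nested JSON, and the sample line only follows the shape of the event described above with illustrative values.

```python
import json
import re

# Sample line shaped like the boundary_request event; values are illustrative.
LINE = json.dumps({
    "ts": "2026-06-09T14:00:00Z",
    "msg": "boundary_request",
    "fields": {
        "owner": "austenplatform",
        "decision": "deny",
        "http_url": "https://example.com/path",
    },
})


def parse_logfmt(line: str) -> dict:
    """Rough logfmt reader: collects whitespace-separated key=value pairs."""
    pairs = re.findall(r'(?:^|\s)([A-Za-z_][\w.]*)=("[^"]*"|\S*)', line)
    return {key: value.strip('"') for key, value in pairs}


def extract_json(line: str, mapping: dict) -> dict:
    """Resolve label -> dotted path (for example decision -> fields.decision)."""
    doc = json.loads(line)
    labels = {}
    for label, path in mapping.items():
        node = doc
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        if node is not None:
            labels[label] = str(node)
    return labels


# The JSON line contains no key=value pairs, so the logfmt reader finds no labels.
print(parse_logfmt(LINE))
# Explicit extraction reaches the nested audit fields.
print(extract_json(LINE, {"decision": "fields.decision", "owner": "fields.owner"}))
```

With no labels extracted, a filter such as decision=~`deny|allow` has nothing to match, which is consistent with every panel returning empty.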

Changes

deploy/observability/dashboards-boundary.yaml: switch the five audit panels (Request Totals, Top Allowed Domains, Top Denied Domains, Most recent allowed requests, Most recent denied requests) from `| logfmt` to the LogQL `json` parser with explicit field extraction (decision=`fields.decision`, owner=`fields.owner`, etc.). Update the domain/path regexp to match the JSON-quoted http_url value. Label names are preserved, so the existing field overrides, transformations, and line_format templates need no changes.
Correct the stale "not emitted yet / read empty by design" wording in the dashboard header text panel, the YAML comment block, and deploy/observability/AI_GOVERNANCE_DASHBOARD.md.
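The regexp change can be illustrated with a short sketch. Both patterns below are hypothetical and written only for this example; the dashboard's actual regexp is not reproduced in this description. The named-group syntax `(?P<name>...)` is the same in Python and in the RE2 dialect LogQL uses, and the URL is a placeholder.

```python
import re

# Hypothetical patterns for illustration only; not the dashboard's actual regexp.
# Old shape: expects the logfmt form http_url=<value>.
LOGFMT_URL = re.compile(r'http_url="?https?://(?P<domain>[^/"\s]+)(?P<path>/[^"\s?]*)?')
# New shape: expects the JSON-quoted form "http_url":"<value>".
JSON_URL = re.compile(r'"http_url":"https?://(?P<domain>[^/"]+)(?P<path>/[^"?]*)?')

# Compact JSON line shaped like the stored event; the URL is a placeholder.
JSON_LINE = (
    '{"msg":"boundary_request","fields":{"decision":"deny",'
    '"http_url":"https://example.com/docs/index.html","http_method":"GET"}}'
)


def domain_and_path(pattern: re.Pattern, line: str):
    """Return (domain, path) if the pattern matches the line, else None."""
    match = pattern.search(line)
    if match is None:
        return None
    return match.group("domain"), match.group("path") or "/"


print(domain_and_path(LOGFMT_URL, JSON_LINE))  # logfmt-style pattern finds nothing
print(domain_and_path(JSON_URL, JSON_LINE))    # JSON-quoted pattern captures both groups
```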

No datasource, scrape, or promtail changes were needed: the datasource UIDs (loki, prometheus, aibridge-postgres) already match, Loki ingests coderd logs under namespace="coder", and the Prometheus throughput metric agent_boundary_log_proxy_batches_forwarded_total is scraped and correct.

Root cause and live verification evidence

Live boundary_request line as stored in Loki (JSON, fields nested):

{"ts":"...","msg":"boundary_request","fields":{"owner":"austenplatform","workspace_name":"firewall-test","decision":"allow","http_url":"https://...","http_method":"POST",...}}

Broken vs fixed Loki query (1h range, firewall-test):

# BROKEN (old): returns []
sum by (decision) (count_over_time({namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | logfmt | decision=~`deny|allow` [1h]))

# FIXED (new): returns allow + deny series
sum by (decision) (count_over_time({namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | json decision=`fields.decision` | decision=~`deny|allow` [1h]))
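To replay the fixed query outside Grafana, one option is Loki's instant query endpoint, `/loki/api/v1/query`. The sketch below only builds the request URL and sends nothing; the base URL `http://loki.example:3100` is a placeholder, and `instant_query_url` is a hypothetical helper rather than anything used in this PR.

```python
from urllib.parse import urlencode

# The fixed query from this PR, kept verbatim.
FIXED_QUERY = (
    "sum by (decision) (count_over_time("
    "{namespace=~`(coder|coder-workspaces)`} |= `boundary_request` "
    "| json decision=`fields.decision` | decision=~`deny|allow` [1h]))"
)


def instant_query_url(base_url: str, logql: str) -> str:
    """Build a URL for Loki's instant query endpoint with the query URL-encoded."""
    return f"{base_url.rstrip('/')}/loki/api/v1/query?{urlencode({'query': logql})}"


# Placeholder host; substitute the Loki address reachable from your environment.
print(instant_query_url("http://loki.example:3100", FIXED_QUERY))
```

URL-encoding matters here because the query contains backticks, pipes, and braces that a shell or HTTP client could otherwise mangle.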

End-to-end validation on the usgov-coderdemo stack:

Embedded dashboard JSON parses cleanly (16 panels); Grafana accepted a temp import of the corrected dashboard.
Replaying the corrected stored exprs against Loki returned data for all five panels, e.g. Most recent denied requests parsed GET raw.githubusercontent.com/anthropics/claude-code/.../CHANGELOG.md (deny).
The live provisioned dashboard was updated (ConfigMap applied, sidecar reloaded): 0 audit panels on logfmt, 5 on json.
Grafana's own /api/ds/query datasource proxy rendered decision=allow (value 3) and decision=deny (value 19) for the corrected dashboard.
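The "0 audit panels on logfmt, 5 on json" check can be approximated with a small script over the dashboard JSON. This is a sketch under assumptions: it relies on the common Grafana layout of a top-level `panels` list whose entries carry `targets` with an `expr` string, it ignores nested row panels, and the sample dashboard is made up for the example.

```python
import json
import re


def parser_counts(dashboard_json: str) -> dict:
    """Count panels whose query expressions use the logfmt or json parser stage."""
    dashboard = json.loads(dashboard_json)
    counts = {"logfmt": 0, "json": 0}
    for panel in dashboard.get("panels", []):
        exprs = [target.get("expr", "") for target in panel.get("targets", [])]
        for name in counts:
            # Match a pipeline stage such as "| json" or "| logfmt".
            if any(re.search(rf"\|\s*{name}\b", expr) for expr in exprs):
                counts[name] += 1
    return counts


# Made-up dashboard with one panel per case: json, logfmt, and a Prometheus metric.
SAMPLE = json.dumps({
    "panels": [
        {"title": "Request Totals", "targets": [
            {"expr": "{namespace=`coder`} |= `boundary_request` | json decision=`fields.decision`"}]},
        {"title": "Legacy panel", "targets": [
            {"expr": "{namespace=`coder`} |= `boundary_request` | logfmt"}]},
        {"title": "Throughput", "targets": [
            {"expr": "rate(agent_boundary_log_proxy_batches_forwarded_total[5m])"}]},
    ]
})

print(parser_counts(SAMPLE))
```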

Generated by Coder Agents, on behalf of @ausbru87.

…shboard

The Agent Firewall dashboard's Loki audit panels showed "No data" because they parsed coderd's boundary_request lines with `| logfmt`. coderd emits those lines as JSON with the audit fields nested under "fields", so logfmt extracted nothing and the decision/owner/template filters never matched.

Switch the five audit panels (Request Totals, Top Allowed/Denied Domains, Most recent allowed/denied requests) to the LogQL `json` parser with explicit field extraction (decision=`fields.decision`, owner=`fields.owner`, ...), and update the domain/path regexp to match the JSON-quoted `http_url` value rather than the logfmt `http_url=` form. Label names are preserved, so panel overrides, transformations, and line_format templates are unchanged.

Also correct the stale "not emitted yet / read empty by design" wording in the dashboard header and AI_GOVERNANCE_DASHBOARD.md: boundary_request events are emitted and the panels now show live allow/deny data.

Verified live against the usgov-coderdemo stack: Loki returns allow and deny series for the firewall-test workspace, and Grafana's datasource proxy renders allow=3 / deny=19 for the corrected dashboard.

Generated by Coder Agents.

ausbru87 wants to merge 1 commit into main from ws-2x/fix-boundary-dashboard
