fix(deploy/observability): parse boundary_request JSON in firewall dashboard#43
Open
ausbru87 wants to merge 1 commit into
…shboard

The Agent Firewall dashboard's Loki audit panels showed "No data" because they parsed coderd's boundary_request lines with `| logfmt`. coderd emits those lines as JSON with the audit fields nested under "fields", so logfmt extracted nothing and the decision/owner/template filters never matched.

Switch the five audit panels (Request Totals, Top Allowed/Denied Domains, Most recent allowed/denied requests) to the LogQL `json` parser with explicit field extraction (decision=`fields.decision`, owner=`fields.owner`, ...), and update the domain/path regexp to match the JSON-quoted `http_url` value rather than the logfmt `http_url=` form. Label names are preserved, so panel overrides, transformations, and line_format templates are unchanged.

Also correct the stale "not emitted yet / read empty by design" wording in the dashboard header and AI_GOVERNANCE_DASHBOARD.md: boundary_request events are emitted and the panels now show live allow/deny data.

Verified live against the usgov-coderdemo stack: Loki returns allow and deny series for the firewall-test workspace, and Grafana's datasource proxy renders allow=3 / deny=19 for the corrected dashboard.

Generated by Coder Agents.
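The regexp change described in this commit can be illustrated with a small Python sketch. Both patterns and both sample lines are hypothetical; the dashboard's actual regexp and the real log lines are not reproduced here. Python's named-group syntax `(?P<name>...)` is used because LogQL's `regexp` stage accepts the same form.

```python
import re

# Hypothetical pattern in the spirit of the old panels: expects the logfmt form `http_url=...`.
LOGFMT_STYLE = re.compile(r'http_url=https?://(?P<domain>[^/\s"]+)(?P<path>/[^\s"]*)?')

# Hypothetical pattern in the spirit of the fix: expects the JSON-quoted `"http_url":"..."` form.
JSON_STYLE = re.compile(r'"http_url":\s*"https?://(?P<domain>[^/"]+)(?P<path>/[^"]*)?"')

# Illustrative lines only (assumptions), shaped like the two formats discussed above.
LOGFMT_LINE = "decision=allow http_url=https://example.com/docs/index.html"
JSON_LINE = '{"fields":{"decision":"allow","http_url":"https://example.com/docs/index.html"}}'


def domain_and_path(pattern, line):
    """Return (domain, path) if the pattern matches the line, else None."""
    match = pattern.search(line)
    if match is None:
        return None
    # A URL without an explicit path is reported with "/" as its path.
    return match.group("domain"), match.group("path") or "/"
```

Under these assumptions, the logfmt-style pattern finds nothing in the JSON line because the literal `http_url=` never occurs in it, while the JSON-style pattern yields the domain and path.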
Summary
The Agent Firewall (Coder Boundary) Grafana dashboard showed "No data" on its egress audit panels even though coderd is actively emitting `boundary_request` events for the `austenplatform/firewall-test` workspace.

Root cause: the five Loki audit panels parsed the log line with `| logfmt`, but coderd emits `boundary_request` as JSON with the audit fields nested under `fields` (for example `fields.decision`, `fields.http_url`). `logfmt` extracted nothing, so the `decision`, `owner`, `template_id`, and `template_version_id` label filters never matched and every panel returned empty. The domain/path `regexp` also assumed the logfmt `http_url=` form, which does not exist in the JSON line.

Changes
- `deploy/observability/dashboards-boundary.yaml`: switch the five audit panels (Request Totals, Top Allowed Domains, Top Denied Domains, Most recent allowed requests, Most recent denied requests) from `| logfmt` to the LogQL `json` parser with explicit field extraction (decision=`fields.decision`, owner=`fields.owner`, etc.). Update the domain/path regexp to match the JSON-quoted `http_url` value. Label names are preserved, so the existing field overrides, transformations, and `line_format` templates need no changes.
- Dashboard header and `deploy/observability/AI_GOVERNANCE_DASHBOARD.md`: correct the stale "not emitted yet / read empty by design" wording; `boundary_request` events are emitted and the panels now show live allow/deny data.

No datasource, scrape, or promtail changes were needed: the datasource UIDs (`loki`, `prometheus`, `aibridge-postgres`) already match, Loki ingests coderd logs under `namespace="coder"`, and the Prometheus throughput metric `agent_boundary_log_proxy_batches_forwarded_total` is scraped and correct.

Root cause and live verification evidence
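The parser mismatch behind the empty panels can be reproduced outside Loki with a minimal Python sketch. The sample line is hypothetical and only mirrors the shape described in the summary (JSON with the audit fields nested under `fields`); it is not the line stored in Loki. The `key=value` matcher is a simplified stand-in for a logfmt parser, not Loki's implementation.

```python
import json
import re

# Hypothetical log line for illustration only (an assumption, not real coderd output).
SAMPLE_LINE = json.dumps({
    "msg": "boundary_request",
    "fields": {
        "decision": "deny",
        "owner": "example-owner",
        "http_url": "https://example.com/some/path",
    },
})

# Simplified logfmt-style matcher: a key=value pair must start the line
# or follow whitespace. Real logfmt parsers handle more edge cases.
LOGFMT_PAIR = re.compile(r'(?:^|\s)([A-Za-z_][\w.]*)=("[^"]*"|\S*)')


def parse_logfmt(line: str) -> dict:
    """Collect key=value pairs the way a logfmt-style parser would."""
    return {key: value.strip('"') for key, value in LOGFMT_PAIR.findall(line)}


def extract_json_fields(line: str, mapping: dict) -> dict:
    """Mimic explicit extraction, e.g. label "decision" from path "fields.decision"."""
    doc = json.loads(line)
    labels = {}
    for label, path in mapping.items():
        node = doc
        # Walk the dotted path one key at a time; stop if a level is missing.
        for part in path.split("."):
            node = node.get(part) if isinstance(node, dict) else None
        if node is not None:
            labels[label] = node
    return labels
```

With this sample, `parse_logfmt(SAMPLE_LINE)` returns an empty dict because the JSON text contains no `key=value` pairs, so a filter on `decision` has nothing to match. `extract_json_fields(SAMPLE_LINE, {"decision": "fields.decision", "owner": "fields.owner"})` recovers both labels under their original names, which is why downstream overrides keyed on those names would not need to change.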
Live `boundary_request` line as stored in Loki (JSON, fields nested):

Broken vs fixed Loki query (1h range, firewall-test):
End-to-end validation on the usgov-coderdemo stack:
- `GET raw.githubusercontent.com/anthropics/claude-code/.../CHANGELOG.md` (deny).
- `logfmt`, 5 on `json`.
- Grafana's `/api/ds/query` datasource proxy rendered `decision=allow` (value 3) and `decision=deny` (value 19) for the corrected dashboard.

Generated by Coder Agents, on behalf of @ausbru87.