15 changes: 8 additions & 7 deletions deploy/observability/AI_GOVERNANCE_DASHBOARD.md
@@ -135,13 +135,14 @@ traffic flows; this is expected, not an error:
`aibridge_user_prompts` / `aibridge_tool_usages` yet).
- **Firewall Sessions** (stat and table) read 0 (no rows in `boundary_sessions`
yet).
- - The Agent Firewall log stream currently carries only Boundary proxy lifecycle
- lines. The upstream `coder/observability` boundary dashboard parses
- `boundary_request` allow / deny audit events from the `coderd.agentrpc`
- logger; those events are not emitted in this stack until egress traffic is
- audited, so allow / deny breakdown panels are intentionally not included here
- yet. They become populatable once Boundary audits real egress (and would also
- benefit from a newer Coder that logs `boundary_request`).
+ - The Agent Firewall log stream (namespace `coder-workspaces`) carries Boundary
+ proxy lifecycle lines. The allow / deny audit breakdown is driven separately
+ by coderd's structured `boundary_request` log lines (namespace `coder`),
+ which Loki ingests as JSON with the audit fields nested under `fields`. The
+ Agent Firewall dashboard parses them with the LogQL `json` parser
+ (`fields.decision`, `fields.owner`, `fields.http_url`, and related fields),
+ so the **Egress Audit (allow / deny)** panels show live data while the
+ firewalled workspaces generate egress.

Panels that already have data: provider health and inventory, total
interceptions, active sessions, unique users, interceptions by provider / model /
21 changes: 12 additions & 9 deletions deploy/observability/dashboards-boundary.yaml
@@ -22,9 +22,12 @@
# label here; the boundary_request line filter is the reliable narrow.
# - Helm template includes resolved to literals (non-workspace-selector,
# dashboard-range, dashboard-refresh) since this is rendered JSON, not a chart.
- # - The upstream allow/deny audit panels need live audited egress to populate;
- # boundary_request events are not emitted in this stack yet (placeholder AI
- # key, no audited egress, older Coder). They are correct, not broken.
+ # - The allow/deny audit panels parse coderd's structured boundary_request log
+ # lines from Loki. coderd emits each line as JSON with the audit fields
+ # nested under "fields", so the LogQL uses the `json` parser (extracting
+ # fields.decision, fields.owner, fields.http_url, etc.) rather than `logfmt`,
+ # and the domain regexp matches the JSON-quoted http_url value. The
+ # firewalled workspaces emit live allow and deny events, so these populate.
# - Operations row (batch counters, active agents, sessions, proxy log stream)
# is retained from the in-repo dashboard so the view has populated panels
# today: Prometheus (agent_boundary_*) and Loki proxy lifecycle have data;
@@ -162,7 +165,7 @@ data:
},
"options": {
"mode": "markdown",
- "content": "**Agent Firewall** (Coder's Boundary) audits and controls outbound network activity from Coder workspaces, giving security teams visibility into what AI agents reach.\n\n- **Egress Audit (allow / deny)**: per-request allow and deny decisions parsed from Loki `boundary_request` audit events (adapted from the upstream coder/observability boundary dashboard). These populate once Boundary audits real egress; until then they read empty by design, not error.\n- **Agent Firewall Operations**: forwarded log-proxy batch counters and active agents from Prometheus (agent_boundary_*), Boundary sessions from the Coder database, and the live proxy log stream from Loki.\n\nDisplay terminology is product-facing (\"Agent Firewall\"); PromQL still references agent_boundary_*, LogQL still matches the literal log text \"boundary\" / \"boundary_request\", and the Coder database keeps its boundary_* table names."
+ "content": "**Agent Firewall** (Coder's Boundary) audits and controls outbound network activity from Coder workspaces, giving security teams visibility into what AI agents reach.\n\n- **Egress Audit (allow / deny)**: per-request allow and deny decisions parsed from Loki `boundary_request` audit events (adapted from the upstream coder/observability boundary dashboard). These show live allow and deny decisions parsed from coderd's structured boundary_request events; the firewalled workspaces are actively audited.\n- **Agent Firewall Operations**: forwarded log-proxy batch counters and active agents from Prometheus (agent_boundary_*), Boundary sessions from the Coder database, and the live proxy log stream from Loki.\n\nDisplay terminology is product-facing (\"Agent Firewall\"); PromQL still references agent_boundary_*, LogQL still matches the literal log text \"boundary\" / \"boundary_request\", and the Coder database keeps its boundary_* table names."
}
},
{
@@ -260,7 +263,7 @@ data:
},
"direction": "backward",
"editorMode": "code",
- "expr": "sum by (decision) (count_over_time({namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | logfmt | decision=~`deny|allow` | owner=~`$owner` | domain=~`$domain` | template_id=~`$template_id` | template_version_id=~`$template_version_id` [$__range]))",
+ "expr": "sum by (decision) (count_over_time({namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | json decision=`fields.decision`, owner=`fields.owner`, template_id=`fields.template_id`, template_version_id=`fields.template_version_id` | decision=~`deny|allow` | owner=~`$owner` | template_id=~`$template_id` | template_version_id=~`$template_version_id` | regexp `\"http_url\":\"(?P<scheme>https?)://(?P<domain>[^/:\"]+)` | domain=~`$domain` [$__range]))",
"queryType": "range",
"refId": "A"
}
@@ -339,7 +342,7 @@ data:
},
"direction": "backward",
"editorMode": "code",
- "expr": "topk(20, sum by (domain) (count_over_time({namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | logfmt | decision=`allow` | owner=~`$owner` | template_id=~`$template_id` | template_version_id=~`$template_version_id` | regexp `http_url=(?P<scheme>https?)://(?P<domain>[^/:]+)` | domain=~`$domain` [$__auto])))",
+ "expr": "topk(20, sum by (domain) (count_over_time({namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | json decision=`fields.decision`, owner=`fields.owner`, template_id=`fields.template_id`, template_version_id=`fields.template_version_id` | decision=`allow` | owner=~`$owner` | template_id=~`$template_id` | template_version_id=~`$template_version_id` | regexp `\"http_url\":\"(?P<scheme>https?)://(?P<domain>[^/:\"]+)` | domain=~`$domain` [$__auto])))",
"legendFormat": "",
"queryType": "instant",
"refId": "A"
@@ -448,7 +451,7 @@ data:
},
"direction": "backward",
"editorMode": "code",
- "expr": "topk(20, sum by (domain) (count_over_time({namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | logfmt | decision=`deny` | owner=~`$owner` | template_id=~`$template_id` | template_version_id=~`$template_version_id` | regexp `http_url=(?P<scheme>https?)://(?P<domain>[^/:]+)` | domain=~`$domain` [$__auto])))",
+ "expr": "topk(20, sum by (domain) (count_over_time({namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | json decision=`fields.decision`, owner=`fields.owner`, template_id=`fields.template_id`, template_version_id=`fields.template_version_id` | decision=`deny` | owner=~`$owner` | template_id=~`$template_id` | template_version_id=~`$template_version_id` | regexp `\"http_url\":\"(?P<scheme>https?)://(?P<domain>[^/:\"]+)` | domain=~`$domain` [$__auto])))",
"legendFormat": "",
"queryType": "instant",
"refId": "A"
@@ -550,7 +553,7 @@ data:
},
"direction": "backward",
"editorMode": "code",
- "expr": "{namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | logfmt | decision=`allow` | owner=~`$owner` | template_id=~`$template_id` | template_version_id=~`$template_version_id` | regexp `http_url=https?://(?P<domain>[^/?# ]+)(?P<path>/[^?# ]*)?` | domain=~`$domain` | line_format `time=\"{{ .event_time }}\" method=\"{{ .http_method }}\" domain=\"{{ .domain }}\" path=\"{{ .path }}\" owner=\"{{ .owner }}\" workspace_name=\"{{ .workspace_name }}\" template_id=\"{{ .template_id }}\" template_version_id=\"{{ .template_version_id }}\"`",
+ "expr": "{namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | json decision=`fields.decision`, owner=`fields.owner`, workspace_name=`fields.workspace_name`, template_id=`fields.template_id`, template_version_id=`fields.template_version_id`, http_method=`fields.http_method`, event_time=`fields.event_time` | decision=`allow` | owner=~`$owner` | template_id=~`$template_id` | template_version_id=~`$template_version_id` | regexp `\"http_url\":\"https?://(?P<domain>[^/?#:\" ]+)(?P<path>/[^?#\" ]*)?` | domain=~`$domain` | line_format `time=\"{{ .event_time }}\" method=\"{{ .http_method }}\" domain=\"{{ .domain }}\" path=\"{{ .path }}\" owner=\"{{ .owner }}\" workspace_name=\"{{ .workspace_name }}\" template_id=\"{{ .template_id }}\" template_version_id=\"{{ .template_version_id }}\"`",
"queryType": "range",
"refId": "A"
}
@@ -687,7 +690,7 @@ data:
},
"direction": "backward",
"editorMode": "code",
- "expr": "{namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | logfmt | decision=`deny` | owner=~`$owner` | template_id=~`$template_id` | template_version_id=~`$template_version_id` | regexp `http_url=https?://(?P<domain>[^/?# ]+)(?P<path>/[^?# ]*)?` | domain=~`$domain` | line_format `time=\"{{ .event_time }}\" method=\"{{ .http_method }}\" domain=\"{{ .domain }}\" path=\"{{ .path }}\" owner=\"{{ .owner }}\" workspace_name=\"{{ .workspace_name }}\" template_id=\"{{ .template_id }}\" template_version_id=\"{{ .template_version_id }}\"`",
+ "expr": "{namespace=~`(coder|coder-workspaces)`} |= `boundary_request` | json decision=`fields.decision`, owner=`fields.owner`, workspace_name=`fields.workspace_name`, template_id=`fields.template_id`, template_version_id=`fields.template_version_id`, http_method=`fields.http_method`, event_time=`fields.event_time` | decision=`deny` | owner=~`$owner` | template_id=~`$template_id` | template_version_id=~`$template_version_id` | regexp `\"http_url\":\"https?://(?P<domain>[^/?#:\" ]+)(?P<path>/[^?#\" ]*)?` | domain=~`$domain` | line_format `time=\"{{ .event_time }}\" method=\"{{ .http_method }}\" domain=\"{{ .domain }}\" path=\"{{ .path }}\" owner=\"{{ .owner }}\" workspace_name=\"{{ .workspace_name }}\" template_id=\"{{ .template_id }}\" template_version_id=\"{{ .template_version_id }}\"`",
"queryType": "range",
"refId": "A"
}
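Reviewer note: the queries above switch from `logfmt` to the `json` parser and run the domain regexp against the raw, JSON-quoted line. A minimal Python sketch of that extraction may help when checking the pattern offline. The sample log line, its field values, and the `extract_labels` helper are hypothetical and are not taken from coderd output. The sketch also assumes the line is compact JSON (no space after the colon), which the `"http_url":"` prefix in the regexp depends on.

```python
import json
import re

# The domain pattern from the stat and top-domain panels, with the
# JSON-string escaping (\") of the dashboard file removed.
DOMAIN_RE = re.compile(r'"http_url":"(?P<scheme>https?)://(?P<domain>[^/:"]+)')

# Fields the panels pull out of the nested "fields" object.
AUDIT_KEYS = ("decision", "owner", "template_id", "template_version_id")


def extract_labels(line: str) -> dict:
    """Approximate the `json` + `regexp` stages for one log line."""
    record = json.loads(line)
    fields = record.get("fields", {})
    labels = {key: fields.get(key) for key in AUDIT_KEYS}
    # The regexp runs on the raw line, not on the parsed value.
    match = DOMAIN_RE.search(line)
    labels["domain"] = match.group("domain") if match else None
    return labels


# Hypothetical boundary_request line; real coderd lines may carry more keys.
SAMPLE_LINE = json.dumps(
    {
        "msg": "boundary_request",
        "fields": {
            "decision": "deny",
            "owner": "alice",
            "template_id": "t-1",
            "template_version_id": "tv-1",
            "http_url": "https://api.example.com:8443/v1/items?x=1",
        },
    },
    separators=(",", ":"),  # compact output, matching the regexp's assumption
)

if __name__ == "__main__":
    print(extract_labels(SAMPLE_LINE))
```

If coderd were ever to emit JSON with whitespace after the colon, the `"http_url":"` prefix would stop matching and `domain` would come back empty, so the domain filter is worth re-testing after a Coder upgrade.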