[maestrod] Add Grafana dashboard (per-route AI token usage + RED + vision) #146
Open
olihou wants to merge 6 commits into
…sion)
Mirror the document-engine Grafana-sidecar pattern for the maestrod chart:
- dashboards/maestrod-single-namespace.json: 18-panel dashboard across four rows: AI tokens & cost (per-route nutrient.ai.tokens_total, MEAI token distribution, AI latency p50/p95, attempts/empty/warnings), per-route HTTP RED, vision quality/throughput, and process health (working-set memory + CPU only). Uses the standard placeholder tokens.
- templates/monitoring/grafana-dashboard.ConfigMap.yaml: ConfigMap labeled grafana_dashboard: "1" for sidecar auto-discovery; replaces placeholders and indents the dashboard JSON.
- values.yaml + values.schema.json: observability.metrics.grafanaDashboard block (disabled by default) with configMap.labels, title, tags.
- Bump chart 0.6.2 -> 0.7.0; CHANGELOG entry.
Requires the existing serviceMonitor (or another scrape path) and a Grafana sidecar watching the grafana_dashboard label.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

helm-docs output for the new observability.metrics.grafanaDashboard block; fixes the `generate` CI check (README drift).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Rename the per-route token table's raw label columns to readable headers (operation->Route, gen_ai_request_model->Model, gen_ai_token_type->Token type, Value->Tokens (range total)); drop the Time column and order them.
- Add a descriptive y-axis label to every timeseries panel (tokens/s, latency (s), error ratio, requests/s, cores, count / page, …) so each graph states what it plots alongside the unit.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The dashboard never computed price: rename the "AI tokens & cost" row to "AI token usage" and reword the range-total panel to make clear it shows token counts only, with no cost/pricing applied.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
tomassurin
approved these changes
Jun 19, 2026
data:
  maestrod-{{ printf "%s-%s" .Release.Namespace .Release.Name }}.json: |-
{{ .Files.Get "dashboards/maestrod-single-namespace.json"
    | replace "<<<<DASHBOARD_TITLE>>>>" (tpl .Values.observability.metrics.grafanaDashboard.title $)
Contributor
`observability.metrics.grafanaDashboard.title` and `tags` are user-facing values, but values containing JSON-significant characters break the dashboard payload. A valid Helm value such as `title: Maestrod "prod"`, or a tag containing `"`, renders a ConfigMap whose JSON cannot be imported by the Grafana sidecar.
Evidence: the template inserts raw strings into JSON placeholders via `replace`/`join`, while the dashboard keeps those placeholders inside JSON string/array literals.
Fix: JSON-encode these values instead of using raw replacement. For example, replace the whole tags literal with a `toJson` value, and render the title placeholder from an escaped/encoded string.
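One way the suggested fix could look, as a rough sketch rather than the chart's actual template. It assumes the dashboard JSON holds the title as `"title": "<<<<DASHBOARD_TITLE>>>>"` and that the tags live behind a quoted placeholder; the name `<<<<DASHBOARD_TAGS>>>>` is hypothetical and stands in for whatever token the dashboard really uses. The idea is to swap the placeholder together with its surrounding quotes for a `toJson`-encoded value, so quotes and backslashes in user input are escaped.

```yaml
# Sketch only. Assumptions: the dashboard JSON contains
#   "title": "<<<<DASHBOARD_TITLE>>>>"
#   "tags": "<<<<DASHBOARD_TAGS>>>>"     (hypothetical placeholder name)
# Each replace matches the placeholder *including* its quotes and substitutes
# a complete JSON value produced by toJson (a quoted string or an array).
data:
  maestrod-{{ printf "%s-%s" .Release.Namespace .Release.Name }}.json: |-
{{ .Files.Get "dashboards/maestrod-single-namespace.json"
    | replace "\"<<<<DASHBOARD_TITLE>>>>\"" (tpl .Values.observability.metrics.grafanaDashboard.title $ | toJson)
    | replace "\"<<<<DASHBOARD_TAGS>>>>\"" (.Values.observability.metrics.grafanaDashboard.tags | toJson)
    | indent 4 }}
```

With this shape, `title: Maestrod "prod"` would render as `"title": "Maestrod \"prod\""`, and a tags list would render as a JSON array, so the payload stays parseable regardless of the characters in the values.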
What
Adds a Grafana dashboard to the `maestrod` chart, mirroring the `document-engine` Grafana-sidecar pattern (the `serviceMonitor` already exists).

- `dashboards/maestrod-single-namespace.json`: 14 panels across four rows:
  - AI token usage: per-route tokens (`nutrient.ai.tokens_total`), MEAI token distribution, AI call latency p50/p95, attempts/empty/warnings. Token counts only; no cost/pricing applied.
  - HTTP RED per `/run/*` route.
  - Vision quality/throughput.
  - Process health (working-set memory + CPU only).
- `templates/monitoring/grafana-dashboard.ConfigMap.yaml`: ConfigMap labeled `grafana_dashboard: "1"` for sidecar auto-discovery; `.Files.Get` + placeholder `replace` + `indent 4`.
- `values.yaml` / `values.schema.json`: `observability.metrics.grafanaDashboard` block (disabled by default): `enabled`, `configMap.labels`, `title`, `tags`.
- `README.md`: regenerated via helm-docs for the new values.

Panel labels are written to be self-explanatory: every timeseries carries a descriptive y-axis label (`tokens/s`, `latency (s)`, `error ratio`, `requests/s`, `cores`, `count / page`, …) and the per-route token table renames raw Prometheus labels to readable column headers (Route / Model / Token type / Tokens (range total)).

How to enable
- `observability.metrics.serviceMonitor.enabled=true` (Prometheus scrapes `/metrics`).
- `observability.metrics.grafanaDashboard.enabled=true` (renders the labeled ConfigMap).
- A Grafana sidecar watching the `grafana_dashboard` label auto-imports it.

Validation
No `helm` binary was available on the authoring host, so the change was validated by parsing `values.schema.json` and simulating the ConfigMap's placeholder substitution (incl. the `tags` join): the rendered dashboard is valid JSON, and all 14 panels reference real metric families emitted by maestrod's `/metrics`. `README.md` was regenerated with helm-docs 1.14.2 so the `generate` CI check passes.

Notes
The per-route token metric (`nutrient.ai.tokens_total` with an `operation` label) is added by the companion daemon PR: PSPDFKit/GdPicture#2804. The two per-route token panels stay empty until a daemon build carrying that change is deployed; the by-model MEAI token panel (`gen_ai_client_token_usage`) works today.

Companion
Daemon-side metric: PSPDFKit/GdPicture#2804.
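
For reference, the enable steps described in this PR might translate into a values override along these lines. This is a sketch: the key paths come from the description, but the `title` and `tags` values are made-up examples (not the chart's defaults), and the shape of `configMap.labels` as a plain label map is an assumption.

```yaml
# Illustrative values override for the maestrod chart.
# title/tags below are example values, not the chart defaults.
observability:
  metrics:
    serviceMonitor:
      enabled: true              # Prometheus scrapes /metrics
    grafanaDashboard:
      enabled: true              # renders the labeled ConfigMap (disabled by default)
      configMap:
        labels:
          grafana_dashboard: "1" # assumed map form; the label the Grafana sidecar watches
      title: Maestrod
      tags:
        - maestrod
```

A Grafana sidecar configured to watch the `grafana_dashboard` label should then pick up the ConfigMap once the release is installed or upgraded.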
🤖 Generated with Claude Code