17 changes: 15 additions & 2 deletions charts/maestrod/CHANGELOG.md
@@ -1,14 +1,27 @@
# Changelog

- [Changelog](#changelog)
- [0.7.0 (2026-06-17)](#070-2026-06-17)
- [Added](#added)
- [0.6.2 (2026-06-12)](#062-2026-06-12)
- [Changed](#changed)
- [0.6.1 (2026-05-30)](#061-2026-05-30)
- [Changed](#changed-1)
- [0.6.0 (2026-05-29)](#060-2026-05-29)
- [Added](#added)
- [0.5.0 (2026-05-27)](#050-2026-05-27)
- [Added](#added-1)
- [0.5.0 (2026-05-27)](#050-2026-05-27)
- [Added](#added-2)

## 0.7.0 (2026-06-17)

### Added

- Grafana dashboard for Maestrod, delivered as a sidecar-discovered ConfigMap
(`observability.metrics.grafanaDashboard`, disabled by default). Panels cover
per-route AI token usage (`nutrient.ai.tokens_total`), AI call latency and
reliability, per-route HTTP RED metrics, vision quality/throughput, and process
health. Requires the `serviceMonitor` (or another scrape path) and a Grafana
sidecar watching the `grafana_dashboard` label.
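
As a rough sketch of how the new option might be switched on, a values override could look like the following. The keys are the ones documented in this chart's README; the file name `values-dashboard.yaml` is hypothetical, and enabling `serviceMonitor` assumes the Prometheus Operator CRDs are already installed in the cluster.

```yaml
# values-dashboard.yaml (hypothetical file name)
observability:
  metrics:
    # Scrape path for Maestrod's /metrics; any other scrape path also works.
    serviceMonitor:
      enabled: true
    grafanaDashboard:
      # Disabled by default; this renders the dashboard ConfigMap.
      enabled: true
      configMap:
        labels:
          # Chart default; change only if your sidecar uses another selector.
          grafana_dashboard: "1"
```

Passing this file with Helm's `-f` flag on install or upgrade should be enough for a Grafana sidecar watching the `grafana_dashboard` label to pick the dashboard up.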

## 0.6.2 (2026-06-12)

2 changes: 1 addition & 1 deletion charts/maestrod/Chart.yaml
@@ -4,7 +4,7 @@ type: application
description: Maestrod, the orchestration backend for Nutrient managed cloud workloads.
home: https://www.nutrient.io
icon: https://cdn.prod.website-files.com/65fdb7696055f07a05048833/66e58e33c3880ff24aa34027_nutrient-logo.png
version: 0.6.2
version: 0.7.0
appVersion: "1.1.3"

keywords:
82 changes: 44 additions & 38 deletions charts/maestrod/README.md
@@ -2,7 +2,7 @@

> [!WARNING] This chart is made for internal use by Nutrient.

![Version: 0.6.2](https://img.shields.io/badge/Version-0.6.2-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.1.3](https://img.shields.io/badge/AppVersion-1.1.3-informational?style=flat-square)
![Version: 0.7.0](https://img.shields.io/badge/Version-0.7.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 1.1.3](https://img.shields.io/badge/AppVersion-1.1.3-informational?style=flat-square)

Maestrod, the orchestration backend for Nutrient managed cloud workloads.

@@ -221,6 +221,12 @@ namespace.
|-----|-------------|---------|
| [`observability`](./values.yaml#L209) | Observability settings for Maestrod. | [...](./values.yaml#L209) |
| [`observability.metrics`](./values.yaml#L213) | Metrics integration settings. | [...](./values.yaml#L213) |
| [`observability.metrics.grafanaDashboard`](./values.yaml#L251) | Grafana dashboard delivered as a sidecar-discovered ConfigMap. Requires the `serviceMonitor` (or another scrape path) so Prometheus has Maestrod's `/metrics`, and a Grafana sidecar watching ConfigMaps with the `grafana_dashboard` label. | [...](./values.yaml#L251) |
| [`observability.metrics.grafanaDashboard.configMap`](./values.yaml#L258) | ConfigMap parameters. | [...](./values.yaml#L258) |
| [`observability.metrics.grafanaDashboard.configMap.labels`](./values.yaml#L262) | ConfigMap labels. The Grafana sidecar discovers dashboards by the `grafana_dashboard` label; keep it set unless your sidecar uses a different selector. | `{"grafana_dashboard":"1"}` |
| [`observability.metrics.grafanaDashboard.enabled`](./values.yaml#L254) | Create the Grafana dashboard ConfigMap for Maestrod. | `false` |
| [`observability.metrics.grafanaDashboard.tags`](./values.yaml#L271) | Dashboard tags. | `["Nutrient","maestrod"]` |
| [`observability.metrics.grafanaDashboard.title`](./values.yaml#L268) | Dashboard title. | `*generated*` |
| [`observability.metrics.serviceMonitor`](./values.yaml#L218) | Prometheus [ServiceMonitor](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#monitoring.coreos.com/v1.ServiceMonitor) scraping Maestrod's `/metrics` endpoint on the existing `http` Service port. | [...](./values.yaml#L218) |
| [`observability.metrics.serviceMonitor.enabled`](./values.yaml#L221) | Create a Prometheus Operator ServiceMonitor for Maestrod. | `false` |
| [`observability.metrics.serviceMonitor.honorLabels`](./values.yaml#L242) | Honor labels from scraped metrics. | `false` |
@@ -236,53 +242,53 @@ namespace.

| Key | Description | Default |
|-----|-------------|---------|
| [`lifecycle`](./values.yaml#L296) | [Container lifecycle hooks](https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/). | `{}` |
| [`livenessProbe`](./values.yaml#L268) | [Liveness probe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) against Maestrod's `/health` HTTP endpoint. Polls less often than readiness and is more forgiving — a failure restarts the container, so this should only trip on true deadlock. Set `livenessProbe: {}` to disable. | [...](./values.yaml#L268) |
| [`readinessProbe`](./values.yaml#L282) | [Readiness probe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) against Maestrod's `/health` HTTP endpoint. Set `readinessProbe: {}` to disable. | [...](./values.yaml#L282) |
| [`startupProbe`](./values.yaml#L253) | [Startup probe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) against Maestrod's `/health` HTTP endpoint. Generous `failureThreshold` so a slow initial boot doesn't get killed (10 s × 30 = 5 min budget). Set `startupProbe: {}` to disable. | [...](./values.yaml#L253) |
| [`terminationGracePeriodSeconds`](./values.yaml#L293) | [Termination grace period](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/). | `30` |
| [`lifecycle`](./values.yaml#L324) | [Container lifecycle hooks](https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/). | `{}` |
| [`livenessProbe`](./values.yaml#L296) | [Liveness probe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) against Maestrod's `/health` HTTP endpoint. Polls less often than readiness and is more forgiving — a failure restarts the container, so this should only trip on true deadlock. Set `livenessProbe: {}` to disable. | [...](./values.yaml#L296) |
| [`readinessProbe`](./values.yaml#L310) | [Readiness probe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) against Maestrod's `/health` HTTP endpoint. Set `readinessProbe: {}` to disable. | [...](./values.yaml#L310) |
| [`startupProbe`](./values.yaml#L281) | [Startup probe](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) against Maestrod's `/health` HTTP endpoint. Generous `failureThreshold` so a slow initial boot doesn't get killed (10 s × 30 = 5 min budget). Set `startupProbe: {}` to disable. | [...](./values.yaml#L281) |
| [`terminationGracePeriodSeconds`](./values.yaml#L321) | [Termination grace period](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/). | `30` |
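
The startup budget quoted in the table (10 s × 30 = 5 min) can be read as a probe shaped roughly like the sketch below. The field values are inferred from that description rather than copied from `values.yaml`, and the `http` port name is an assumption.

```yaml
startupProbe:
  httpGet:
    path: /health
    port: http          # assumed container port name
  periodSeconds: 10     # one check every 10 s
  failureThreshold: 30  # 30 failed checks allowed: 10 s x 30 = 300 s
```

Raising `failureThreshold` lengthens the boot budget without changing how often the endpoint is polled.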

### Scheduling

| Key | Description | Default |
|-----|-------------|---------|
| [`affinity`](./values.yaml#L372) | Node affinity. | `{}` |
| [`autoscaling`](./values.yaml#L303) | [HorizontalPodAutoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/). When `enabled: true`, the chart's HPA controls the replica count and `replicaCount` is ignored. | [...](./values.yaml#L303) |
| [`autoscaling.behavior`](./values.yaml#L321) | HPA [scaling behaviour](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#configurable-scaling-behavior). | `{}` |
| [`autoscaling.enabled`](./values.yaml#L306) | Enable the HPA. | `false` |
| [`autoscaling.maxReplicas`](./values.yaml#L312) | Maximum replicas. | `10` |
| [`autoscaling.minReplicas`](./values.yaml#L309) | Minimum replicas. | `1` |
| [`autoscaling.targetCPUUtilizationPercentage`](./values.yaml#L315) | Target average CPU utilisation (percentage). `null` disables the metric. | `nil` |
| [`autoscaling.targetMemoryUtilizationPercentage`](./values.yaml#L318) | Target average memory utilisation (percentage). `null` disables the metric. | `nil` |
| [`nodeSelector`](./values.yaml#L369) | [Node selector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/). | `{}` |
| [`podDisruptionBudget`](./values.yaml#L356) | [PodDisruptionBudget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). When both `minAvailable` and `maxUnavailable` are non-empty, `maxUnavailable` wins (the two fields are mutually exclusive in Kubernetes). Either field accepts an integer (e.g. `1`) or a percentage string (e.g. `"50%"`). | [...](./values.yaml#L356) |
| [`podDisruptionBudget.create`](./values.yaml#L359) | Create a PodDisruptionBudget for Maestrod. | `false` |
| [`podDisruptionBudget.maxUnavailable`](./values.yaml#L365) | `spec.maxUnavailable`. Integer or percentage string. Takes precedence over `minAvailable`. | `""` |
| [`podDisruptionBudget.minAvailable`](./values.yaml#L362) | `spec.minAvailable`. Integer or percentage string. Ignored when `maxUnavailable` is set. | `1` |
| [`priorityClassName`](./values.yaml#L381) | [PriorityClass](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/) name. | `""` |
| [`replicaCount`](./values.yaml#L335) | Number of replicas. Ignored when `autoscaling.enabled` is `true`. | `3` |
| [`resources`](./values.yaml#L325) | [Resources](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/). | `{"limits":{"cpu":"4","memory":"8Gi"},"requests":{"cpu":"4","memory":"8Gi"}}` |
| [`revisionHistoryLimit`](./values.yaml#L348) | [Revision history limit](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#clean-up-policy). | `1` |
| [`schedulerName`](./values.yaml#L384) | [Scheduler](https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/) name. | `""` |
| [`tolerations`](./values.yaml#L375) | [Node tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/). | `[]` |
| [`topologySpreadConstraints`](./values.yaml#L378) | [Topology spread constraints](https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/). | `[]` |
| [`updateStrategy`](./values.yaml#L341) | [Update strategy](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy). `rollingUpdate.maxSurge` and `rollingUpdate.maxUnavailable` are `IntOrString` in Kubernetes — both an integer (e.g. `1`) and a percentage string (e.g. `"25%"`) are accepted. | `{"rollingUpdate":{"maxSurge":1,"maxUnavailable":0},"type":"RollingUpdate"}` |
| [`affinity`](./values.yaml#L400) | Node affinity. | `{}` |
| [`autoscaling`](./values.yaml#L331) | [HorizontalPodAutoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/). When `enabled: true`, the chart's HPA controls the replica count and `replicaCount` is ignored. | [...](./values.yaml#L331) |
| [`autoscaling.behavior`](./values.yaml#L349) | HPA [scaling behaviour](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#configurable-scaling-behavior). | `{}` |
| [`autoscaling.enabled`](./values.yaml#L334) | Enable the HPA. | `false` |
| [`autoscaling.maxReplicas`](./values.yaml#L340) | Maximum replicas. | `10` |
| [`autoscaling.minReplicas`](./values.yaml#L337) | Minimum replicas. | `1` |
| [`autoscaling.targetCPUUtilizationPercentage`](./values.yaml#L343) | Target average CPU utilisation (percentage). `null` disables the metric. | `nil` |
| [`autoscaling.targetMemoryUtilizationPercentage`](./values.yaml#L346) | Target average memory utilisation (percentage). `null` disables the metric. | `nil` |
| [`nodeSelector`](./values.yaml#L397) | [Node selector](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/). | `{}` |
| [`podDisruptionBudget`](./values.yaml#L384) | [PodDisruptionBudget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/). When both `minAvailable` and `maxUnavailable` are non-empty, `maxUnavailable` wins (the two fields are mutually exclusive in Kubernetes). Either field accepts an integer (e.g. `1`) or a percentage string (e.g. `"50%"`). | [...](./values.yaml#L384) |
| [`podDisruptionBudget.create`](./values.yaml#L387) | Create a PodDisruptionBudget for Maestrod. | `false` |
| [`podDisruptionBudget.maxUnavailable`](./values.yaml#L393) | `spec.maxUnavailable`. Integer or percentage string. Takes precedence over `minAvailable`. | `""` |
| [`podDisruptionBudget.minAvailable`](./values.yaml#L390) | `spec.minAvailable`. Integer or percentage string. Ignored when `maxUnavailable` is set. | `1` |
| [`priorityClassName`](./values.yaml#L409) | [PriorityClass](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/) name. | `""` |
| [`replicaCount`](./values.yaml#L363) | Number of replicas. Ignored when `autoscaling.enabled` is `true`. | `3` |
| [`resources`](./values.yaml#L353) | [Resources](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/). | `{"limits":{"cpu":"4","memory":"8Gi"},"requests":{"cpu":"4","memory":"8Gi"}}` |
| [`revisionHistoryLimit`](./values.yaml#L376) | [Revision history limit](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#clean-up-policy). | `1` |
| [`schedulerName`](./values.yaml#L412) | [Scheduler](https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/) name. | `""` |
| [`tolerations`](./values.yaml#L403) | [Node tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/). | `[]` |
| [`topologySpreadConstraints`](./values.yaml#L406) | [Topology spread constraints](https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/). | `[]` |
| [`updateStrategy`](./values.yaml#L369) | [Update strategy](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy). `rollingUpdate.maxSurge` and `rollingUpdate.maxUnavailable` are `IntOrString` in Kubernetes — both an integer (e.g. `1`) and a percentage string (e.g. `"25%"`) are accepted. | `{"rollingUpdate":{"maxSurge":1,"maxUnavailable":0},"type":"RollingUpdate"}` |

### Restart job

| Key | Description | Default |
|-----|-------------|---------|
| [`restartJob`](./values.yaml#L391) | Optional CronJob that polls the configured image registry for a new digest on the running `image.tag` and patches the Maestrod Deployment with a refresh annotation to trigger a rollout. Disabled by default. | [...](./values.yaml#L391) |
| [`restartJob.affinity`](./values.yaml#L431) | Affinity for the restart-job pod. | `{}` |
| [`restartJob.enabled`](./values.yaml#L394) | Enable the restart-job CronJob and its supporting RBAC/ServiceAccount. | `false` |
| [`restartJob.image`](./values.yaml#L402) | Image for the restart-job container. Must contain `kubectl`, `curl`, `jq`, and `bash` — `alpine/k8s` covers all four. | [...](./values.yaml#L402) |
| [`restartJob.nodeSelector`](./values.yaml#L425) | Node selector for the restart-job pod. | `{}` |
| [`restartJob.podAnnotations`](./values.yaml#L413) | Pod annotations for the restart-job pod. | `{"skip-auto-labelling":"true"}` |
| [`restartJob.podLabels`](./values.yaml#L417) | Pod labels for the restart-job pod. | `{}` |
| [`restartJob.registryAuthSecretName`](./values.yaml#L410) | Name of a pre-existing `kubernetes.io/dockerconfigjson` Secret holding the registry credentials used to query the image manifest. Required when `restartJob.enabled: true`; rendering fails otherwise. | `""` |
| [`restartJob.schedule`](./values.yaml#L397) | CronJob schedule. | `"*/10 * * * *"` |
| [`restartJob.serviceAccount`](./values.yaml#L421) | ServiceAccount for the restart-job pod. | [...](./values.yaml#L421) |
| [`restartJob.tolerations`](./values.yaml#L428) | Tolerations for the restart-job pod. | `[]` |
| [`restartJob`](./values.yaml#L419) | Optional CronJob that polls the configured image registry for a new digest on the running `image.tag` and patches the Maestrod Deployment with a refresh annotation to trigger a rollout. Disabled by default. | [...](./values.yaml#L419) |
| [`restartJob.affinity`](./values.yaml#L459) | Affinity for the restart-job pod. | `{}` |
| [`restartJob.enabled`](./values.yaml#L422) | Enable the restart-job CronJob and its supporting RBAC/ServiceAccount. | `false` |
| [`restartJob.image`](./values.yaml#L430) | Image for the restart-job container. Must contain `kubectl`, `curl`, `jq`, and `bash` — `alpine/k8s` covers all four. | [...](./values.yaml#L430) |
| [`restartJob.nodeSelector`](./values.yaml#L453) | Node selector for the restart-job pod. | `{}` |
| [`restartJob.podAnnotations`](./values.yaml#L441) | Pod annotations for the restart-job pod. | `{"skip-auto-labelling":"true"}` |
| [`restartJob.podLabels`](./values.yaml#L445) | Pod labels for the restart-job pod. | `{}` |
| [`restartJob.registryAuthSecretName`](./values.yaml#L438) | Name of a pre-existing `kubernetes.io/dockerconfigjson` Secret holding the registry credentials used to query the image manifest. Required when `restartJob.enabled: true`; rendering fails otherwise. | `""` |
| [`restartJob.schedule`](./values.yaml#L425) | CronJob schedule. | `"*/10 * * * *"` |
| [`restartJob.serviceAccount`](./values.yaml#L449) | ServiceAccount for the restart-job pod. | [...](./values.yaml#L449) |
| [`restartJob.tolerations`](./values.yaml#L456) | Tolerations for the restart-job pod. | `[]` |

## Contribution
