321 changes: 228 additions & 93 deletions products/kubernetes-operator/guides/configuration.mdx

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions products/kubernetes-operator/guides/introduction.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ doc_type: 'guide'
This document provides an overview of key concepts and usage patterns for the ClickHouse Operator.

## What is the ClickHouse Operator {#what-is-the-clickhouse-operator}

The ClickHouse Operator is a Kubernetes operator that automates the deployment and management of ClickHouse clusters on Kubernetes. Built using the operator pattern, it extends the Kubernetes API with custom resources that represent ClickHouse clusters and their dependencies.

The operator handles:
Expand All @@ -21,9 +22,11 @@ The operator handles:
- Storage provisioning

## Custom resources {#custom-resources}

The operator provides two main custom resource definitions (CRDs):

### ClickHouseCluster {#clickhousecluster}

Represents a ClickHouse database cluster with configurable replicas and shards.

```yaml
Expand All @@ -43,6 +46,7 @@ spec:
```

### KeeperCluster {#keepercluster}

Represents a ClickHouse Keeper cluster for distributed coordination (ZooKeeper replacement).

```yaml
Expand All @@ -59,11 +63,14 @@ spec:
```

## Coordination {#coordination}

### ClickHouse Keeper is required {#clickhouse-keeper-is-required}

Every ClickHouseCluster requires a ClickHouse Keeper cluster for distributed coordination.
The Keeper cluster must be referenced in the ClickHouseCluster spec using `keeperClusterRef`. By default the operator looks in the ClickHouseCluster namespace, but you can also set `keeperClusterRef.namespace` to point at a KeeperCluster in another watched namespace.
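
A cross-namespace reference might look like the fragment below. This is a sketch, not the authoritative schema: the `clickhouse.com/v1alpha1` API version and the `name` field inside `keeperClusterRef` are assumptions that this guide does not state, and all resource and namespace names are hypothetical. Check the API reference for the exact field names.

```yaml
# Sketch only: apiVersion and keeperClusterRef.name are assumptions.
apiVersion: clickhouse.com/v1alpha1
kind: ClickHouseCluster
metadata:
  name: analytics              # hypothetical cluster name
  namespace: apps              # hypothetical namespace
spec:
  keeperClusterRef:
    name: analytics-keeper     # hypothetical KeeperCluster name
    namespace: coordination    # omit to look in the ClickHouseCluster's own namespace
```

The referenced namespace has to be one the operator watches; otherwise the reference cannot be resolved.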

### One-to-one Keeper relationship {#one-to-one-keeper-relationship}

Each ClickHouseCluster must have its own dedicated KeeperCluster. You can't share a single KeeperCluster between multiple ClickHouseClusters.

**Why?** The operator automatically generates a unique authentication key for each ClickHouseCluster to access its Keeper. This key is stored in a Secret and can't be shared.
Expand All @@ -86,9 +93,11 @@ When recreating a cluster:
To avoid authentication errors, either delete the Persistent Volumes manually or recreate both clusters together with fresh storage.

## Schema replication {#schema-replication}

The ClickHouse Operator automatically replicates database definitions across all replicas in a cluster.

### What gets replicated {#what-gets-replicated}

The operator synchronizes:
- [Replicated](/reference/engines/database-engines/replicated) database definitions
- Integration database engines (PostgreSQL, MySQL, etc.)
Expand All @@ -99,6 +108,7 @@ The operator does **not** synchronize:
- Table data (handled by ClickHouse replication)

### Recommended: Use Replicated database engine {#recommended-use-replicated-database-engine}

<Tip>
**Best practice**

Expand All @@ -118,18 +128,22 @@ CREATE DATABASE my_database ON CLUSTER 'default' ENGINE = Replicated;
```

### Avoid non-Replicated engines {#avoid-non-replicated-engines}

Non-replicated database engines (Atomic, Lazy, SQLite, Ordinary) require manual schema management:
- Tables must be created individually on each replica
- Schema drift can occur between nodes
- The operator can't automatically sync new replicas

### Disable schema replication {#disable-schema-replication}

To disable automatic schema replication, set `spec.settings.enableDatabaseSync` to `false` in the ClickHouseCluster resource.
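
As a sketch, the setting sits under `spec.settings` as shown here. The API version and the cluster name are assumptions, and other required fields (such as `keeperClusterRef`) are left out to keep the fragment short.

```yaml
# Fragment only: apiVersion is assumed, other required spec fields are omitted.
apiVersion: clickhouse.com/v1alpha1
kind: ClickHouseCluster
metadata:
  name: my-cluster             # hypothetical cluster name
spec:
  settings:
    enableDatabaseSync: false  # turn off operator-driven schema replication
```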

## Storage management {#storage-management}

The operator manages storage through Kubernetes PersistentVolumeClaims (PVCs).

### Data volume configuration {#data-volume-configuration}

Specify storage requirements in `dataVolumeClaimSpec`:

```yaml
Expand All @@ -142,6 +156,7 @@ spec:
```

### Storage lifecycle {#storage-lifecycle}

- **Creation**: PVCs are created automatically with the cluster
- **Expansion**: Supported if the StorageClass allows volume expansion
- **Retention**: PVCs are **not** deleted automatically on cluster deletion
Expand All @@ -161,6 +176,7 @@ kubectl delete pvc -l app.kubernetes.io/instance=my-cluster-clickhouse
```
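
The expansion behavior listed above depends on the StorageClass rather than on the operator. A minimal sketch of a StorageClass that permits expansion follows; the class name is hypothetical and the provisioner is only an example, so substitute the CSI driver your cluster actually uses.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-ssd         # hypothetical class name
provisioner: ebs.csi.aws.com   # example CSI driver; replace with your own
allowVolumeExpansion: true     # without this, PVC resize requests are rejected
```

Whether a resize takes effect online or needs a pod restart depends on the CSI driver.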

## Default configuration highlights {#default-configuration-highlights}

* **Pre-configured Cluster:** Cluster named 'default' containing all ClickHouse nodes.
* **Default macros:** Some useful macros are pre-defined:
- `{cluster}`: Cluster name (`default`)
Expand All @@ -170,5 +186,6 @@ kubectl delete pvc -l app.kubernetes.io/instance=my-cluster-clickhouse
* **Replicated storage for user-defined functions (UDFs)**

## Next steps {#next-steps}

- [Configuration Guide](/products/kubernetes-operator/guides/configuration) - Detailed configuration options
- [API Reference](/products/kubernetes-operator/reference/api-reference) - Complete API documentation
19 changes: 18 additions & 1 deletion products/kubernetes-operator/guides/monitoring.mdx
@@ -1,5 +1,5 @@
---
-position: 3
+position: 4
slug: /clickhouse-operator/guides/monitoring
title: Monitoring the ClickHouse Operator
keywords: ['kubernetes', 'prometheus', 'monitoring', 'metrics']
Expand All @@ -16,6 +16,7 @@ This guide is about the **operator process itself** (the controller manager). Fo
</Note>

## Endpoints {#endpoints}

The operator process exposes two HTTP endpoints inside the manager pod:

| Endpoint | Default port | Path | Purpose |
Expand All @@ -28,6 +29,7 @@ The metrics endpoint is **off by default** when running the operator binary dire
The health probe endpoint is always on; the deployment template wires `/healthz` and `/readyz` to the pod's liveness and readiness probes on port `8081`.

## Operator binary flags {#operator-binary-flags}

The relevant `manager` flags (defined in [`cmd/main.go`](https://github.com/ClickHouse/clickhouse-operator/blob/main/cmd/main.go)):

| Flag | Default | Description |
Expand All @@ -46,6 +48,7 @@ The `8443` (HTTPS) / `8080` (HTTP) convention in the flag's help text is only a
</Note>

## Enable metrics via Helm {#enable-metrics-via-helm}

The chart already creates a `Service` for the metrics port and, optionally, a `ServiceMonitor` for prometheus-operator.

The metrics endpoint itself is on by default (`metrics.enable: true`, port `8080`, served over HTTPS via `metrics.secure: true`). The only setting you typically need to flip is `prometheus.enable` to have the chart create a `ServiceMonitor` for you:
Expand Down Expand Up @@ -90,6 +93,7 @@ After install the chart creates:
- `ClusterRole/<resource-prefix>-metrics-reader` — non-resource URL `/metrics` with `get` verb.
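
For reference, a ClusterRole of the shape described above, plus a binding that grants it to a scraper, could look like this. It is a sketch under stated assumptions: the resource prefix `clickhouse-operator`, the ServiceAccount name, and the `monitoring` namespace are all hypothetical, and the chart already creates the ClusterRole for you.

```yaml
# Shape of the chart-created role (the prefix "clickhouse-operator" is assumed).
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: clickhouse-operator-metrics-reader
rules:
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
# Binding you would add so a scraper's ServiceAccount may read /metrics.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-metrics-reader    # hypothetical binding name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: clickhouse-operator-metrics-reader
subjects:
  - kind: ServiceAccount
    name: prometheus                 # hypothetical scraper ServiceAccount
    namespace: monitoring            # hypothetical namespace
```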

## Securing the metrics endpoint {#securing-the-metrics-endpoint}

When `metrics.secure: true` the metrics server enforces TLS **and** Kubernetes authentication/authorization on every scrape. Scrapers must:

1. Present a valid Kubernetes bearer token.
Expand Down Expand Up @@ -131,6 +135,7 @@ If you see `401 Unauthorized` or `403 Forbidden` from the metrics endpoint, the
</Warning>

## ServiceMonitor reference {#servicemonitor-reference}

The chart renders a ServiceMonitor of this shape when `prometheus.enable: true`:

```yaml
Expand Down Expand Up @@ -168,9 +173,11 @@ spec:
If your cluster does not use cert-manager, set `tlsConfig.insecureSkipVerify: true` and rely on bearer-token authentication only; the chart already does this when `certManager.enable: false`.
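
If you maintain the ServiceMonitor yourself, the endpoint section for that case might look like the sketch below. The port name `https`, the selector label, and the resource name are assumptions and not values confirmed by the chart; `bearerTokenFile` is a long-standing prometheus-operator field that newer releases mark as deprecated in favor of `authorization`.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: clickhouse-operator-metrics                  # hypothetical name
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: clickhouse-operator    # hypothetical label
  endpoints:
    - port: https                                    # assumed Service port name
      path: /metrics
      scheme: https
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      tlsConfig:
        insecureSkipVerify: true                     # skips server certificate verification
```

Skipping verification keeps the connection encrypted but no longer proves the server's identity, so prefer a CA-based `tlsConfig` where cert-manager is available.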

## Standalone Prometheus example {#standalone-prometheus-example}

If you do not use kube-prometheus-stack, the repository ships a self-contained example at [`examples/prometheus_secure_metrics_scraper.yaml`](https://github.com/ClickHouse/clickhouse-operator/blob/main/examples/prometheus_secure_metrics_scraper.yaml). It creates a ServiceAccount, the necessary RBAC, and a `Prometheus` CR that selects the operator's ServiceMonitor.

## Health probe endpoints {#health-probe-endpoints}

| Path | Used by | Returns |
|---|---|---|
| `/healthz` | Kubernetes liveness probe | `200 OK` as long as the probe server is listening. |
Expand Down Expand Up @@ -198,9 +205,11 @@ readinessProbe:
A repeatedly failing probe usually means the probe server itself never came up — for example, the manager exited early during startup. Check the manager logs for `unable to start manager`, RBAC failures, or `cache did not sync` errors.

## Metrics catalog {#metrics-catalog}

The operator does not register custom Prometheus collectors. Everything below is exposed by the underlying `controller-runtime` and `client-go` libraries. The most useful series, grouped by purpose:

### Reconciliation activity {#reconciliation-activity}

| Metric | Type | Labels |
|---|---|---|
| `controller_runtime_reconcile_total` | counter | `controller`, `result` (`success` / `error` / `requeue` / `requeue_after`) |
Expand All @@ -212,6 +221,7 @@ The operator does not register custom Prometheus collectors. Everything below is
The `controller` label is derived by `controller-runtime` from the resource type registered with `For(...)`. With the current code in `internal/controller/clickhouse` and `internal/controller/keeper` this resolves to `clickhousecluster` and `keepercluster` respectively. If you have customized the operator, verify with a one-time scrape of `/metrics`.

### Work queue {#work-queue}

| Metric | Type | Labels |
|---|---|---|
| `workqueue_depth` | gauge | `name`, `controller`, `priority` |
Expand All @@ -225,21 +235,25 @@ The `controller` label is derived by `controller-runtime` from the resource type
The `name` and `controller` labels carry the same value (the controller name).

### API server traffic {#api-server-traffic}

| Metric | Type | Labels |
|---|---|---|
| `rest_client_requests_total` | counter | `code`, `method`, `host` |

### Leader election {#leader-election}

| Metric | Type | Labels |
|---|---|---|
| `leader_election_master_status` | gauge | `name` (= `d4ceba06.clickhouse.com`) |

The Helm chart enables `--leader-elect` by default, so this metric is present in standard Helm installs. When running the binary directly without the flag, the metric is absent.

### Runtime {#runtime}

Standard Go process and runtime collectors — `go_goroutines`, `go_memstats_*`, `process_cpu_seconds_total`, `process_resident_memory_bytes`, etc.

## Useful PromQL queries {#useful-promql-queries}

### Health overview

```promql
Expand Down Expand Up @@ -282,6 +296,7 @@ sum(leader_election_master_status{name="d4ceba06.clickhouse.com"})
```

## Suggested alerts {#suggested-alerts}

A starting point for a PrometheusRule (tune thresholds for your environment):

```yaml
Expand Down Expand Up @@ -330,6 +345,7 @@ groups:
The last rule is only meaningful when leader election is enabled.

## Verifying the setup {#verifying-the-setup}

A quick end-to-end check, assuming the chart was installed in `clickhouse-operator-system`:

```bash
Expand Down Expand Up @@ -357,5 +373,6 @@ kubectl -n $NS run curl-metrics --rm -it --restart=Never \
If the scrape returns metrics in the Prometheus exposition format, the endpoint and RBAC are correctly wired.

## Related guides {#related-guides}

- [Installation](/products/kubernetes-operator/install/helm) — Helm values relevant to monitoring.
- [Configuration](/products/kubernetes-operator/guides/configuration) — TLS configuration shared with the metrics server.
7 changes: 6 additions & 1 deletion products/kubernetes-operator/guides/scaling.mdx
@@ -1,5 +1,5 @@
---
-position: 4
+position: 5
slug: /clickhouse-operator/guides/scaling
title: Scaling clusters
keywords: ['kubernetes', 'scaling', 'replicas', 'shards', 'keeper', 'quorum']
Expand All @@ -16,6 +16,7 @@ A `ClickHouseCluster` always needs a Keeper, referenced through the required `sp
</Note>

## Scaling replicas {#scaling-replicas}

`spec.replicas` sets the number of replicas in every shard. Each replica runs in its own StatefulSet named `<cluster>-clickhouse-<shard>-<replica>`, so a cluster with `shards: 2` and `replicas: 3` runs six StatefulSets.

Raise or lower the count in place:
Expand All @@ -30,6 +31,7 @@ spec:
On scale up the operator creates the new per-replica StatefulSets, waits for each pod to become ready, and then synchronizes the schema to the new replicas (see [Automatic schema sync](#automatic-schema-sync)). On scale down it removes the surplus StatefulSets and cleans up the stale replicated-database replica registrations the removed replicas left behind.

## Scaling shards {#scaling-shards}

`spec.shards` sets the number of shards. Each new shard adds a full set of per-replica StatefulSets, and the operator creates one [PodDisruptionBudget per shard](/products/kubernetes-operator/guides/configuration#pod-disruption-budgets) so a disruption in one shard cannot count against another.

```yaml
Expand All @@ -41,6 +43,7 @@ spec:
Each shard holds a distinct slice of the data, and the operator does not copy or move rows between shards. A `Distributed` table or an explicit routing scheme decides which shard a row lands on, so adding a shard gives new writes somewhere to land without touching the rows already stored in the existing shards.

## Automatic schema sync {#automatic-schema-sync}

When `spec.settings.enableDatabaseSync` is `true` (the default), the operator keeps the schema aligned as the topology changes:

- **On scale up** — once at least two replicas are ready, the operator replicates the database definitions to the newly created replicas, so a fresh replica joins with the same `Replicated` and integration databases as the rest of the cluster.
Expand All @@ -51,6 +54,7 @@ This covers `Replicated` databases and integration database engines. It does not
Set `enableDatabaseSync: false` to turn the behavior off, for example when an external tool owns schema propagation. The operator then reports the `SchemaSyncDisabled` reason on the `SchemaInSync` condition.

## Conditions to watch {#scaling-conditions}

Inspect progress on the Custom Resource while a scale operation runs:

```bash
Expand All @@ -72,6 +76,7 @@ kubectl get clickhousecluster sample -o yaml | sed -n '/conditions:/,/^[^ ]/p'
A scale operation is complete when `ClusterSizeAligned` reports `UpToDate`, `SchemaInSync` reports `ReplicasInSync`, and `Ready` reports `AllShardsReady`.
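
As an illustration, a completed scale operation would show a status block roughly like the one below. The condition types and reasons come from this guide; the `status: "True"` values are an assumption based on the usual Kubernetes condition convention, and real output carries additional fields such as timestamps and messages.

```yaml
# Illustrative excerpt of .status on the ClickHouseCluster; not captured from a real cluster.
status:
  conditions:
    - type: ClusterSizeAligned
      status: "True"            # assumed value
      reason: UpToDate
    - type: SchemaInSync
      status: "True"            # assumed value
      reason: ReplicasInSync
    - type: Ready
      status: "True"            # assumed value
      reason: AllShardsReady
```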

## Scaling Keeper {#scaling-keeper}

A `KeeperCluster` runs a Raft quorum, so the operator changes its membership **one replica at a time** and only while the cluster is in a stable state. This protects the quorum: a `2F+1` cluster tolerates `F` members down, so a 3-node cluster keeps working with one member missing and a 5-node cluster with two.
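
The tolerance figures above follow from the majority rule. As a short worked derivation, with $N$ the number of Keeper replicas, $F$ the number of failures tolerated, and $Q$ the majority needed to make progress (the symbols $N$ and $Q$ are introduced here for illustration):

```latex
N = 2F + 1
\quad\Longrightarrow\quad
F = \left\lfloor (N - 1)/2 \right\rfloor,
\qquad
Q = \left\lfloor N/2 \right\rfloor + 1

N = 3:\quad F = 1,\ Q = 2
\qquad
N = 5:\quad F = 2,\ Q = 3
\qquad
N = 4:\quad F = 1,\ Q = 3
```

The $N = 4$ case shows why odd sizes are the usual choice: the fourth member raises the required majority without tolerating any additional failure.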

```yaml