diff --git a/products/kubernetes-operator/guides/configuration.mdx b/products/kubernetes-operator/guides/configuration.mdx index 58290241..fce9d115 100644 --- a/products/kubernetes-operator/guides/configuration.mdx +++ b/products/kubernetes-operator/guides/configuration.mdx @@ -10,7 +10,9 @@ doc_type: 'guide' This guide covers how to configure ClickHouse and Keeper clusters using the operator. ## ClickHouseCluster configuration {#clickhousecluster-configuration} + ### Basic configuration {#basic-configuration} + ```yaml apiVersion: clickhouse.com/v1alpha1 kind: ClickHouseCluster @@ -28,6 +30,7 @@ spec: ``` ### Replicas and shards {#replicas-and-shards} + - **Replicas**: Number of ClickHouse instances per shard (for high availability) - **Shards**: Number of horizontal partitions (for scaling) @@ -40,6 +43,7 @@ spec: A cluster with `replicas: 3` and `shards: 2` will create 6 ClickHouse pods total. ### Keeper integration {#keeper-integration} + Every ClickHouse cluster must reference a KeeperCluster for coordination: ```yaml @@ -52,6 +56,7 @@ spec: When `keeperClusterRef.namespace` is set, the operator must watch both namespaces. If `WATCH_NAMESPACE` is configured, include the ClickHouse and Keeper namespaces in that list. ## KeeperCluster configuration {#keepercluster-configuration} + ```yaml apiVersion: clickhouse.com/v1alpha1 kind: KeeperCluster @@ -66,7 +71,10 @@ spec: ``` ## Storage configuration {#storage-configuration} -Configure persistent storage: + +Configure persistent storage with `dataVolumeClaimSpec`, a standard Kubernetes +`PersistentVolumeClaimSpec`. The operator turns it into a per-replica PersistentVolumeClaim +mounted at the data path `/var/lib/clickhouse`: ```yaml spec: @@ -78,11 +86,47 @@ spec: ``` -Operator can modify existing PVC only if the underlying storage class supports volume expansion. +The operator can modify an existing PVC only if the underlying StorageClass supports volume expansion. 
+ + +Attaching extra disks in a multi-disk (JBOD) layout, running without a persistent +volume, expanding capacity, custom storage policies, and the rules for what cannot +change after creation are covered in the dedicated +[Storage and volumes guide](/products/kubernetes-operator/guides/storage). + +## Cluster domain {#cluster-domain} + +`spec.clusterDomain` sets the Kubernetes DNS suffix the operator uses when it builds +the fully-qualified pod host names it writes into the ClickHouse server +configuration. It defaults to `cluster.local` and exists on both +`ClickHouseCluster` and `KeeperCluster`. + +```yaml +spec: + clusterDomain: cluster.local # default; override only for a custom domain +``` + +The operator addresses every pod through the headless Service as +`...svc.`. That suffix flows into +two parts of the generated configuration: + +- On a `ClickHouseCluster`, its value is used for the replica host names in + `remote_servers` (cross-replica and `Distributed` queries). +- On a `KeeperCluster`, its value builds the Keeper node host names that + ClickHouse uses for coordination. + + +Only override this when your cluster's `kubelet` runs with a `--cluster-domain` +other than `cluster.local`. If the value does not match the real cluster domain, +ClickHouse cannot resolve the Keeper and replica host names — coordination and +`Distributed` queries fail with DNS resolution errors. Set the **same** value on the +`ClickHouseCluster` and the `KeeperCluster` it references. ## Pod configuration {#pod-configuration} + ### Automatic topology spread and affinity {#automatic-topology-spread-and-affinity} + Distribute pods across availability zones: ```yaml @@ -97,6 +141,7 @@ Ensure your Kubernetes cluster has enough nodes in different zones to satisfy th ### Manual configuration {#manual-configuration} + Arbitrary pod affinity/anti-affinity rules and topology spread constraints can be specified. 
```yaml @@ -111,23 +156,26 @@ spec: ### See [API Reference](/products/kubernetes-operator/reference/api-reference#podtemplatespec) for all supported Pod template options. ## Pod disruption budgets {#pod-disruption-budgets} + The operator creates a [PodDisruptionBudget](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/) (PDB) for each cluster so that voluntary disruptions — node drains, rolling upgrades, autoscaler evictions — cannot take down enough pods to lose quorum or break availability. For ClickHouse clusters with more than one shard, **one PDB is created per shard** so a disruption in one shard cannot count against another. ### Defaults {#pdb-defaults} + The operator picks safe defaults based on the cluster size so that a fresh `apply` already protects against accidental quorum loss. -| Resource | Topology | Default PDB | -|---|---|---| -| `ClickHouseCluster` | `replicas: 1` (single-replica shard) | `maxUnavailable: 1` — disruption is allowed for a single-node cluster so that node drains are not blocked | -| `ClickHouseCluster` | `replicas: 2+` (multi-replica shard) | `minAvailable: 1` — at least one replica per shard must stay up | -| `KeeperCluster` | `replicas: 1` | `maxUnavailable: 1` — disruption is allowed for a single-node cluster so that node drains are not blocked | -| `KeeperCluster` | `replicas: 3+` | `maxUnavailable: replicas/2` — preserves the RAFT quorum for a `2F+1` cluster (3 replicas tolerate 1 down, 5 replicas tolerate 2 down) | +| Resource | Topology | Default PDB | +|---------------------|--------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------| +| `ClickHouseCluster` | `replicas: 1` (single-replica shard) | `maxUnavailable: 1` — disruption is allowed for a single-node cluster so that node drains are not blocked | +| `ClickHouseCluster` | `replicas: 2+` (multi-replica shard) | `minAvailable: 1` — at least 
one replica per shard must stay up | +| `KeeperCluster` | `replicas: 1` | `maxUnavailable: 1` — disruption is allowed for a single-node cluster so that node drains are not blocked | +| `KeeperCluster` | `replicas: 3+` | `maxUnavailable: replicas/2` — preserves the RAFT quorum for a `2F+1` cluster (3 replicas tolerate 1 down, 5 replicas tolerate 2 down) | For a 3-shard ClickHouseCluster with `replicas: 3`, the operator creates three PDBs, one per shard, each with `minAvailable: 1`. ### Overriding the defaults {#pdb-overrides} + Use `spec.podDisruptionBudget` to override either `minAvailable` **or** `maxUnavailable` (exactly one): ```yaml @@ -161,13 +209,14 @@ spec: ``` ### Policies {#pdb-policies} + `spec.podDisruptionBudget.policy` lets you choose **how aggressively** the operator manages PDBs: -| Policy | Behavior | -|---|---| -| `Enabled` (default) | The operator creates and updates the PDB on every reconcile. This is the safe production default. | -| `Disabled` | The operator does **not** create PDBs and **deletes** any existing ones with matching labels. Useful for development clusters where every voluntary disruption should be allowed. | -| `Ignored` | The operator neither creates nor deletes PDBs. Existing PDBs are left alone. Use this when another system (e.g. policy admission, GitOps tool) owns PDB management for you. | +| Policy | Behavior | +|---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `Enabled` (default) | The operator creates and updates the PDB on every reconcile. This is the safe production default. | +| `Disabled` | The operator does **not** create PDBs and **deletes** any existing ones with matching labels. Useful for development clusters where every voluntary disruption should be allowed. | +| `Ignored` | The operator neither creates nor deletes PDBs. Existing PDBs are left alone. 
Use this when another system (e.g. policy admission, GitOps tool) owns PDB management for you. | Example — disable PDB management completely on a development cluster: @@ -186,6 +235,7 @@ spec: ``` ### Cluster-wide opt-out {#pdb-cluster-wide-disable} + PDB management can also be disabled cluster-wide via the operator's `ENABLE_PDB` environment variable. With `ENABLE_PDB=false`, the operator skips the PDB reconcile step for **every** ClickHouseCluster and KeeperCluster regardless of their `spec.podDisruptionBudget.policy`, and **does not watch** `PodDisruptionBudget` resources at all. The operator's ServiceAccount therefore does not need RBAC permissions on `poddisruptionbudgets.policy/v1`, which is useful when running the operator under a restricted ServiceAccount that intentionally omits those permissions. ```yaml @@ -198,7 +248,9 @@ env: This is intended for environments that ship their own disruption policies (e.g. through Gatekeeper / Kyverno) and want the operator out of the loop entirely. ## Container configuration {#container-configuration} + ### Custom image {#custom-image} + Use a specific ClickHouse image: ```yaml @@ -211,6 +263,7 @@ spec: ``` ### Container resources {#container-resources} + Configure CPU and memory for ClickHouse containers: ```yaml @@ -227,6 +280,7 @@ spec: ``` ### Environment variables {#environment-variables} + Add custom environment variables: ```yaml @@ -238,6 +292,7 @@ spec: ``` ### Volume mounts {#volume-mounts} + Add additional volume mounts: ```yaml @@ -257,11 +312,9 @@ Operator will create projected volume with all specified mounts. ### See [API Reference](/products/kubernetes-operator/reference/api-reference#containertemplatespec) for all supported Container template options. 
## TLS/SSL configuration {#tls-ssl-configuration} -For an end-to-end example — issuing certificates with cert-manager, connecting -clients over the secure ports, and encrypting Keeper traffic — see the -[Securing with TLS](/products/kubernetes-operator/guides/tls) guide. ### Configure secure endpoints {#configure-secure-endpoints} + Pass a reference to a Kubernetes Secret containing TLS certificates to enable secure endpoints: ```yaml @@ -275,22 +328,22 @@ spec: ``` ### SSL certificate secret format {#ssl-certificate-secret-format} -It is expected that the Secret contains the following keys: + +The Secret must contain the server key pair: - `tls.crt` - PEM encoded server certificate - `tls.key` - PEM encoded private key -- `ca.crt` - PEM encoded CA certificate chain This format is compatible with cert-manager generated certificates. ### ClickHouse-Keeper communication over TLS {#clickhouse-keeper-communication-over-tls} + If the KeeperCluster has TLS enabled, the ClickHouseCluster uses a secure connection to Keeper nodes automatically. -ClickHouseCluster should be able to verify Keeper nodes certificates. -If ClickHouseCluster has TLS enabled, is uses `ca.crt` bundle for verification. Otherwise, default CA bundle is used. +ClickHouseCluster verifies Keeper node certificates against the system trust store, plus any `caBundle` you configure. -User may provide a custom CA bundle reference: +To trust a private CA (for example, a self-signed or internal CA), provide a custom CA bundle reference: ```yaml spec: @@ -302,6 +355,7 @@ spec: ``` ## External Secret {#external-secret} + By default the operator creates and owns a Secret containing the cluster's internal credentials (interserver password, management password, keeper identity, cluster secret, named-collections key). The Secret is named after the cluster and lives in the cluster's namespace.
If you want to manage these credentials yourself — for example, sourcing them from HashiCorp Vault, AWS Secrets Manager, or [External Secrets Operator](https://external-secrets.io/) — point the operator at a pre-existing Secret using `spec.externalSecret`: @@ -329,14 +383,15 @@ The referenced Secret must reside in the **same namespace** as the ClickHouseClu ### Required keys {#external-secret-required-keys} + The Secret must contain the following keys: -| Key | Format | When required | -|---|---|---| -| `interserver-password` | plaintext password | Always | -| `management-password` | plaintext password | Always | -| `keeper-identity` | `clickhouse:` | Always | -| `cluster-secret` | plaintext password | Always | +| Key | Format | When required | +|-------------------------|--------------------------------------------|----------------------------| +| `interserver-password` | plaintext password | Always | +| `management-password` | plaintext password | Always | +| `keeper-identity` | `clickhouse:` | Always | +| `cluster-secret` | plaintext password | Always | | `named-collections-key` | hex-encoded 16-byte AES key (32 hex chars) | ClickHouse `>= 25.12` only | A complete Secret looks like this: @@ -357,12 +412,13 @@ stringData: ``` ### Policy: Observe vs Manage {#external-secret-policy} + `spec.externalSecret.policy` controls how the operator handles missing required keys: -| Policy | Behavior on missing keys | -|---|---| -| `Observe` (default) | Reconciliation is **blocked** until every required key is present. The operator reports each missing key — and the format hint for it — via the `ExternalSecretValid` condition (with reason `ExternalSecretInvalid`) and a `Warning` event. | -| `Manage` | The operator **generates** any missing required keys and writes them back to the same Secret. Useful for bootstrapping: create an empty Secret, let the operator fill it, then optionally tighten access. The operator still never deletes the Secret. 
| +| Policy | Behavior on missing keys | +|---------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `Observe` (default) | Reconciliation is **blocked** until every required key is present. The operator reports each missing key — and the format hint for it — via the `ExternalSecretValid` condition (with reason `ExternalSecretInvalid`) and a `Warning` event. | +| `Manage` | The operator **generates** any missing required keys and writes them back to the same Secret. Useful for bootstrapping: create an empty Secret, let the operator fill it, then optionally tighten access. The operator still never deletes the Secret. | Even with `policy: Manage` the Secret must already exist in the namespace — the operator never creates the Secret itself, it only writes generated keys into an existing one. If the referenced Secret is missing, reconciliation is blocked with the `ExternalSecretNotFound` reason regardless of policy. @@ -371,6 +427,7 @@ Even with `policy: Manage` the Secret must already exist in the namespace — th Pick `Observe` when an external system (Vault, ESO, sealed-secrets, GitOps) is the source of truth and you want the operator to fail loudly on misconfiguration. Pick `Manage` when you want self-sufficient bootstrapping but still want to retain ownership of the Secret object itself (for example, to back it up). ### Status condition and troubleshooting {#external-secret-status} + The operator exposes an `ExternalSecretValid` condition on `ClickHouseCluster.status.conditions`.
Inspect it when reconciliation looks stuck: ```bash @@ -386,11 +443,11 @@ kubectl get clickhousecluster sample -o jsonpath='{.status.conditions}' | jq Possible reasons: -| `reason` | Meaning | Fix | -|---|---|---| -| `ExternalSecretNotFound` | The referenced Secret does not exist in the namespace. | Create the Secret, or fix `spec.externalSecret.name`. | -| `ExternalSecretInvalid` | The Secret exists but lacks required keys (only with `Observe`). The message lists each missing key together with its expected format. | Add the missing keys, or switch to `policy: Manage`. | -| `ExternalSecretValid` | All required keys are present and the operator is using the Secret. | — | +| `reason` | Meaning | Fix | +|--------------------------|----------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------| +| `ExternalSecretNotFound` | The referenced Secret does not exist in the namespace. | Create the Secret, or fix `spec.externalSecret.name`. | +| `ExternalSecretInvalid` | The Secret exists but lacks required keys (only with `Observe`). The message lists each missing key together with its expected format. | Add the missing keys, or switch to `policy: Manage`. | +| `ExternalSecretValid` | All required keys are present and the operator is using the Secret. | — | The operator requeues reconciliation while the Secret is invalid, so once you add the missing keys the next reconcile picks them up automatically — no need to bounce pods. @@ -399,6 +456,7 @@ The set of required keys depends on the running ClickHouse version. `named-colle ## Additional ports {#additional-ports} + The operator exposes a fixed set of ports on every ClickHouse Pod and its headless Service: `8123` HTTP, `9000` native, `9009` interserver, `9001` management, `9363` Prometheus metrics, and the TLS variants `8443`/`9440` when TLS is enabled. 
To make ClickHouse listen on additional protocols — MySQL, PostgreSQL, gRPC, or any custom port — declare them in `spec.additionalPorts`: ```yaml @@ -419,6 +477,7 @@ The operator adds those ports to the Pod's `containerPorts` and to the headless ### End-to-end example: MySQL wire protocol {#additional-ports-mysql-example} + To expose ClickHouse over the MySQL wire protocol on port `9004`: ```yaml @@ -458,35 +517,37 @@ kubectl exec sample-clickhouse-0-0-0 -- \ ``` ### Field constraints {#additional-ports-constraints} -| Field | Rule | -|---|---| + +| Field | Rule | +|--------|------------------------------------------------------------------------------------------------------------------------------------------| | `name` | Must match the DNS_LABEL pattern `^[a-z]([-a-z0-9]*[a-z0-9])?$`, max 63 characters. Uniqueness is enforced by the CRD as a list-map key. | -| `port` | Integer in `[1, 65535]`. The webhook rejects duplicate port numbers within the list. | +| `port` | Integer in `[1, 65535]`. The webhook rejects duplicate port numbers within the list. | ### Reserved ports and names {#additional-ports-reserved} + The validating webhook rejects `additionalPorts` entries that would collide with ports the operator binds itself. All TLS-related ports are reserved **unconditionally** so that flipping `spec.settings.tls.enabled` later cannot break a previously valid cluster. 
-| Port | Reserved for | -|---|---| -| `8123` | HTTP | -| `8443` | HTTPS | -| `9000` | native TCP | -| `9440` | native TLS | -| `9009` | interserver | -| `9001` | management | +| Port | Reserved for | +|--------|--------------------| +| `8123` | HTTP | +| `8443` | HTTPS | +| `9000` | native TCP | +| `9440` | native TLS | +| `9009` | interserver | +| `9001` | management | | `9363` | Prometheus metrics | The following names are also rejected — they are the operator's internal protocol-type identifiers (not the human-readable aliases): -| Name | -|---| -| `http` | +| Name | +|---------------| +| `http` | | `http-secure` | -| `tcp` | -| `tcp-secure` | +| `tcp` | +| `tcp-secure` | | `interserver` | -| `management` | -| `prometheus` | +| `management` | +| `prometheus` | A rejected request produces an error such as: @@ -496,12 +557,14 @@ spec.additionalPorts[0].name: "http" is reserved by the operator ``` ## Version probe and upgrade channel {#version-probe-and-upgrade-channel} + The operator does two independent things with cluster versions: -1. **Version probe** — a Kubernetes `Job` that runs the container image once to detect the running ClickHouse / Keeper version. The detected version is recorded in `.status.version` and used by other reconciliation steps (e.g. the `External Secret` named-collections key is only required from ClickHouse `25.12`). +1. **Version reporting** — for `ClickHouseCluster`, a Kubernetes `Job` runs the container image once to detect the running ClickHouse version; for `KeeperCluster`, the operator reads the server-reported version from running replicas. The detected version is recorded in `.status.version` and used by other reconciliation steps (e.g. the `External Secret` named-collections key is only required from ClickHouse `25.12`). 2. **Upgrade channel** — a periodic check against the public ClickHouse release feed (`https://clickhouse.com/data/version_date.tsv`). 
The operator reports whether a newer version is available via the `VersionUpgraded` status condition. It never upgrades the cluster on its own — the user is in control of the image tag. ### Choosing a release channel {#upgrade-channel-choosing} + `spec.upgradeChannel` selects which set of upstream releases the operator compares against. Same field exists on both `ClickHouseCluster` and `KeeperCluster`. ```yaml @@ -511,30 +574,31 @@ spec: Allowed values (validated by the CRD with the pattern `^(lts|stable|\d+\.\d+)?$`): -| Value | Behavior | -|---|---| -| _empty_ (default) | The operator proposes only **minor** updates within the currently-running major.minor line. A cluster on `25.8.3.1` will be told about `25.8.4.x` but not `25.9.x`. | -| `stable` | Tracks the upstream `stable` channel — the latest release that ClickHouse Inc. flags as stable on the main release line. Receives major upgrades sooner than the `lts` channel. | -| `lts` | Tracks the upstream `lts` channel — long-term support releases. Receives major upgrades less frequently, with longer support windows. | -| `25.8` (or any `.`) | Pins the channel to a specific major.minor line. Major upgrades beyond it are not proposed even if a newer version exists upstream. | +| Value | Behavior | +|-----------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| _empty_ (default) | The operator proposes only **minor** updates within the currently-running major.minor line. A cluster on `25.8.3.1` will be told about `25.8.4.x` but not `25.9.x`. | +| `stable` | Tracks the upstream `stable` channel — the latest release that ClickHouse Inc. flags as stable on the main release line. Receives major upgrades sooner than the `lts` channel. | +| `lts` | Tracks the upstream `lts` channel — long-term support releases. 
Receives major upgrades less frequently, with longer support windows. | +| `25.8` (or any `.`) | Pins the channel to a specific major.minor line. Major upgrades beyond it are not proposed even if a newer version exists upstream. | For production, pinning the channel to an explicit `.` (e.g. `25.8`) is generally preferred. It locks the cluster to the intended major release line and lets the operator surface a `WrongReleaseChannel` warning if any replica somehow drifts onto a different major — which matters especially when the image is referenced by a digest (`@sha256:...`) rather than by a human-readable tag. The empty default is fine for development clusters where major-version jumps are not a concern. ### Status conditions {#version-status-conditions} + Two conditions surface the result of the probe and the upgrade check: -| Condition | Reason | Meaning | -|---|---|---| -| `VersionInSync` | `VersionMatch` | All replicas report the same version as the image | -| `VersionInSync` | `VersionMismatch` | Replicas are running different versions. The warning event is suppressed during a planned rolling upgrade. It typically surfaces when a mutable image tag has been pinned (for example `latest` or a bare major like `26.3`) and the underlying registry has shifted between pulls, so different replicas ended up on different patches of the same tag. 
| -| `VersionInSync` | `VersionPending` | Version probe Job has not finished yet | -| `VersionInSync` | `VersionProbeFailed` | Probe Job failed; the operator cannot determine the running version | -| `VersionUpgraded` | `UpToDate` | The cluster is on the latest version available in the selected channel | -| `VersionUpgraded` | `MinorUpdateAvailable` | A newer patch is available in the same `major.minor` line | -| `VersionUpgraded` | `MajorUpdateAvailable` | A newer `major.minor` is available within the chosen channel | -| `VersionUpgraded` | `VersionOutdated` | The running version is out of date and will no longer receive fixes from the selected channel — typically because the major line has been dropped from `lts` or `stable` upstream | -| `VersionUpgraded` | `WrongReleaseChannel` | The running image does not belong to the selected `upgradeChannel`. Example: a cluster running `26.5` with `upgradeChannel: lts`, since `26.5` is not part of the upstream `lts` line. | -| `VersionUpgraded` | `UpgradeCheckFailed` | The operator could not reach the upstream release feed | +| Condition | Reason | Meaning | +|-------------------|------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `VersionInSync` | `VersionMatch` | All replicas report the same version | +| `VersionInSync` | `VersionMismatch` | Replicas are running different versions. This reason is suppressed during a planned rolling upgrade. It typically surfaces when a mutable image tag has been pinned (for example `latest` or a bare major like `26.3`) and the underlying registry has shifted between pulls, so different replicas ended up on different patches of the same tag. 
| +| `VersionInSync` | `VersionPending` | Version probe Job has not finished yet, or no Keeper replica version has been observed yet | +| `VersionInSync` | `VersionProbeFailed` | ClickHouse probe Job failed; the operator cannot determine the running version | +| `VersionUpgraded` | `UpToDate` | The cluster is on the latest version available in the selected channel | +| `VersionUpgraded` | `MinorUpdateAvailable` | A newer patch is available in the same `major.minor` line | +| `VersionUpgraded` | `MajorUpdateAvailable` | A newer `major.minor` is available within the chosen channel | +| `VersionUpgraded` | `VersionOutdated` | The running version is out of date and will no longer receive fixes from the selected channel — typically because the major line has been dropped from `lts` or `stable` upstream | +| `VersionUpgraded` | `WrongReleaseChannel` | The running image does not belong to the selected `upgradeChannel`. Example: a cluster running `26.5` with `upgradeChannel: lts`, since `26.5` is not part of the upstream `lts` line. | +| `VersionUpgraded` | `UpgradeCheckFailed` | The operator could not reach the upstream release feed | Inspect them with: @@ -543,6 +607,9 @@ kubectl get clickhousecluster sample -o yaml | sed -n '/conditions:/,/^[^ ]/p' ``` ### Overriding the version probe Job {#version-probe-template} + +This applies to `ClickHouseCluster` only. `KeeperCluster` no longer runs a version-probe Job — its version is read directly from the running Keeper replicas — so `spec.versionProbeTemplate` is deprecated and has no effect there. + The probe is implemented as a regular Kubernetes `Job`. 
If your cluster has admission policies that require specific tolerations, node selectors, or security contexts, or if you want to limit how long completed probe Jobs linger, override the template via `spec.versionProbeTemplate`: ```yaml @@ -570,41 +637,97 @@ spec: The container name `version-probe` is the operator's default — the entry under `containers:` matches it by name, so the operator deep-merges the user-provided fields on top of the defaults. ### Operator-wide controls {#version-operator-flags} + Two flags on the operator manager control the upgrade-check loop globally: -| Flag | Default | Effect | -|---|---|---| -| `--version-update-interval` | `24h` | How often the operator re-fetches the upstream version list | +| Flag | Default | Effect | +|-----------------------------------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------| +| `--version-update-interval` | `24h` | How often the operator re-fetches the upstream version list | | `--disable-version-update-checks` | `false` | Disables the upgrade checker entirely. The `VersionUpgraded` condition is not set, and no outbound HTTP traffic to `clickhouse.com` is generated | Set `--disable-version-update-checks=true` in air-gapped environments or when egress to `clickhouse.com` is not allowed. ## ClickHouse settings {#clickhouse-settings} + ### Default user password {#default-user-password} -Set the default user password: + +`spec.settings.defaultUserPassword` sets the password for the built-in `default` +user.
Provide the value from a key in a Secret (recommended) or a ConfigMap that +you create, rather than inline in the CR: + +```yaml +spec: + settings: + defaultUserPassword: + passwordType: password # default; see "Password types" below + secret: # exactly one of secret or configMap + name: clickhouse-password # name of the Secret/ConfigMap + key: password # the key inside it, not the password value +``` + +Provide exactly one of `secret` or `configMap`, each with both `name` (the object) +and `key` (the entry that holds the password). + +#### Password types {#password-types} + +`passwordType` tells ClickHouse how to interpret the value. It defaults to +`password` (plaintext); the alternatives are hashed forms such as +`password_sha256_hex` and `password_double_sha1_hex`. Prefer a hashed type so the +plaintext is never stored. See the +[ClickHouse user settings](https://clickhouse.com/docs/operations/settings/settings-users#user-namepassword) +for the full list. + +#### Full example with a Secret {#default-password-secret-example} + +Create the Secret, then reference its key: + +```bash +kubectl create secret generic clickhouse-password \ + --from-literal=password='your-secure-password' +``` ```yaml +apiVersion: clickhouse.com/v1alpha1 +kind: ClickHouseCluster +metadata: + name: my-cluster spec: settings: defaultUserPassword: - passwordType: # Default: password - : - name: - key: + passwordType: password + secret: + name: clickhouse-password + key: password ``` -It isn't recommended to use ConfigMap to store plain text passwords. +With `passwordType: password`, the in-pod `clickhouse-client` is configured with +this password, which is handy for debugging. 
-Create the secret: +For a hashed password, store the hash instead of the plaintext: ```bash -kubectl create secret generic clickhouse-password --from-literal=password='your-secure-password' +echo -n 'your-secure-password' | sha256sum # use the hex digest as the value +kubectl create secret generic clickhouse-password \ + --from-literal=password='' +``` + +```yaml +spec: + settings: + defaultUserPassword: + passwordType: password_sha256_hex + secret: + name: clickhouse-password + key: password ``` -#### Using ConfigMap for user passwords {#using-configmap-for-user-passwords} -You can also use ConfigMap for non-sensitive default passwords: +#### Using a ConfigMap {#using-configmap-for-user-passwords} + +A ConfigMap works the same way, but its contents are not protected like a Secret. +Use it only for non-sensitive or already-hashed values, such as a +`password_sha256_hex` digest: ```yaml spec: @@ -616,7 +739,13 @@ spec: key: default_password ``` + +Do not put a plaintext password in a ConfigMap. Use a Secret for any plaintext +(`passwordType: password`) value. + + ### Custom users in configuration {#custom-users-in-configuration} + Configure additional users in configuration files. Create a ConfigMap and Secret for user: @@ -668,6 +797,7 @@ spec: ``` ### Database sync {#database-sync} + Enable automatic database synchronization for new replicas: ```yaml @@ -679,6 +809,7 @@ spec: When enabled, the operator synchronizes Replicated and integration tables to new replicas. ### Server logging {#server-logging} + Configure the ClickHouse server log through `spec.settings.logger`. Every field is optional with a safe default, so a cluster you never touch already logs at `trace` to both the container console and a rotated file on disk. ```yaml @@ -692,13 +823,13 @@ spec: count: 50 # Default: 50. 
Number of rotated files to keep ``` -| Field | Default | Description | -|---|---|---| -| `logToFile` | `true` | When `false`, the operator drops the file targets and the server logs only to the container console. | -| `jsonLogs` | `false` | When `true`, the operator adds `formatting.type: json` so each line is a JSON object. | -| `level` | `trace` | Log verbosity. One of `test`, `trace`, `debug`, `information`, `notice`, `warning`, `error`, `critical`, `fatal`. | -| `size` | `1000M` | Maximum size of a single log file before rotation. | -| `count` | `50` | Number of rotated log files the server retains. | +| Field | Default | Description | +|-------------|---------|-------------------------------------------------------------------------------------------------------------------| +| `logToFile` | `true` | When `false`, the operator drops the file targets and the server logs only to the container console. | +| `jsonLogs` | `false` | When `true`, the operator adds `formatting.type: json` so each line is a JSON object. | +| `level` | `trace` | Log verbosity. One of `test`, `trace`, `debug`, `information`, `notice`, `warning`, `error`, `critical`, `fatal`. | +| `size` | `1000M` | Maximum size of a single log file before rotation. | +| `count` | `50` | Number of rotated log files the server retains. | The operator always keeps console logging on so that `kubectl logs` works, and layers file logging on top when `logToFile` is `true`. A cluster with the defaults renders this `logger` block: @@ -719,7 +850,9 @@ Console logging stays on regardless of `logToFile`, so `kubectl logs` keeps work ## Custom configuration {#custom-configuration} + ### Embedded extra configuration {#embedded-extra-configuration} + Instead of mounting custom configuration files, you can directly specify additional ClickHouse configuration options. 
Add custom ClickHouse configuration using `extraConfig`: @@ -736,6 +869,7 @@ spec: * [All server settings](/reference/settings/server-settings/settings) ### Embedded extra users configuration {#embedded-extra-users-configuration} + You can also specify additional ClickHouse users configuration using `extraUsersConfig`. This is useful for defining users, profiles, quotas, and grants directly in the cluster specification. ```yaml @@ -767,6 +901,7 @@ The `extraUsersConfig` is stored in k8s ConfigMap object. Avoid plain text secre #### See [documentation](/concepts/features/configuration/settings/settings-users) for all supported ClickHouse users configuration options. ### Configuration example {#configuration-example} + Complete configuration example: ```yaml diff --git a/products/kubernetes-operator/guides/introduction.mdx b/products/kubernetes-operator/guides/introduction.mdx index 5e647a80..547e73ee 100644 --- a/products/kubernetes-operator/guides/introduction.mdx +++ b/products/kubernetes-operator/guides/introduction.mdx @@ -10,6 +10,7 @@ doc_type: 'guide' This document provides an overview of key concepts and usage patterns for the ClickHouse Operator. ## What is the ClickHouse Operator {#what-is-the-clickhouse-operator} + The ClickHouse Operator is a Kubernetes operator that automates the deployment and management of ClickHouse clusters on Kubernetes. Built using the operator pattern, it extends the Kubernetes API with custom resources that represent ClickHouse clusters and their dependencies. The operator handles: @@ -21,9 +22,11 @@ The operator handles: - Storage provisioning ## Custom resources {#custom-resources} + The operator provides two main custom resource definitions (CRDs): ### ClickHouseCluster {#clickhousecluster} + Represents a ClickHouse database cluster with configurable replicas and shards. 
```yaml @@ -43,6 +46,7 @@ spec: ``` ### KeeperCluster {#keepercluster} + Represents a ClickHouse Keeper cluster for distributed coordination (ZooKeeper replacement). ```yaml @@ -59,11 +63,14 @@ spec: ``` ## Coordination {#coordination} + ### ClickHouse Keeper is required {#clickhouse-keeper-is-required} + Every ClickHouseCluster requires a ClickHouse Keeper cluster for distributed coordination. The Keeper cluster must be referenced in the ClickHouseCluster spec using `keeperClusterRef`. By default the operator looks in the ClickHouseCluster namespace, but you can also set `keeperClusterRef.namespace` to point at a KeeperCluster in another watched namespace. ### One-to-One Keeper relationship {#one-to-one-keeper-relationship} + Each ClickHouseCluster must have its own dedicated KeeperCluster. You can't share a single KeeperCluster between multiple ClickHouseClusters. **Why?** The operator automatically generates a unique authentication key for each ClickHouseCluster to access its Keeper. This key is stored in a Secret and can't be shared. @@ -86,9 +93,11 @@ When recreating a cluster: To avoid authentication errors, either delete the Persistent Volumes manually or recreate both clusters together with fresh storage. ## Schema Replication {#schema-replication} + The ClickHouse Operator automatically replicates database definitions across all replicas in a cluster. ### What Gets Replicated {#what-gets-replicated} + The operator synchronizes: - [Replicated](/reference/engines/database-engines/replicated) database definitions - Integration database engines (PostgreSQL, MySQL, etc.) 
@@ -99,6 +108,7 @@ The operator does **not** synchronize: - Table data (handled by ClickHouse replication) ### Recommended: Use Replicated database engine {#recommended-use-replicated-database-engine} + **Best practice** @@ -118,18 +128,22 @@ CREATE DATABASE my_database ON CLUSTER 'default' ENGINE = Replicated; ``` ### Avoid non-Replicated engines {#avoid-non-replicated-engines} + Non-replicated database engines (Atomic, Lazy, SQLite, Ordinary) require manual schema management: - Tables must be created individually on each replica - Schema drift can occur between nodes - Operator can't automatically sync new replicas ### Disable schema replication {#disable-schema-replication} + To disable automatic schema replication, set `spec.settings.enableDatabaseSync` to `false` in the ClickHouseCluster resource. ## Storage management {#storage-management} + The operator manages storage through Kubernetes PersistentVolumeClaims (PVCs). ### Data volume configuration {#data-volume-configuration} + Specify storage requirements in `dataVolumeClaimSpec`: ```yaml @@ -142,6 +156,7 @@ spec: ``` ### Storage Lifecycle {#storage-lifecycle} + - **Creation**: PVCs are created automatically with the cluster - **Expansion**: Supported if StorageClass allows volume expansion - **Retention**: PVCs are **not** deleted automatically on cluster deletion @@ -161,6 +176,7 @@ kubectl delete pvc -l app.kubernetes.io/instance=my-cluster-clickhouse ``` ## Default configuration highlights {#default-configuration-highlights} + * **Pre-configured Cluster:** Cluster named 'default' containing all ClickHouse nodes. 
* **Default macros:** Some useful macros are pre-defined: - `{cluster}`: Cluster name (`default`) @@ -170,5 +186,6 @@ kubectl delete pvc -l app.kubernetes.io/instance=my-cluster-clickhouse * **Replicated storage for User Defined Functions(UDF)** ## Next steps {#next-steps} + - [Configuration Guide](/products/kubernetes-operator/guides/configuration) - Detailed configuration options - [API Reference](/products/kubernetes-operator/reference/api-reference) - Complete API documentation diff --git a/products/kubernetes-operator/guides/monitoring.mdx b/products/kubernetes-operator/guides/monitoring.mdx index 736bd1ff..6dac1dab 100644 --- a/products/kubernetes-operator/guides/monitoring.mdx +++ b/products/kubernetes-operator/guides/monitoring.mdx @@ -1,5 +1,5 @@ --- -position: 3 +position: 4 slug: /clickhouse-operator/guides/monitoring title: Monitoring the ClickHouse Operator keywords: ['kubernetes', 'prometheus', 'monitoring', 'metrics'] @@ -16,6 +16,7 @@ This guide is about the **operator process itself** (the controller manager). Fo ## Endpoints {#endpoints} + The operator process exposes two HTTP endpoints inside the manager pod: | Endpoint | Default port | Path | Purpose | @@ -28,6 +29,7 @@ The metrics endpoint is **off by default** when running the operator binary dire The health probe endpoint is always on; the deployment template wires `/healthz` and `/readyz` to the pod's liveness and readiness probes on port `8081`. ## Operator binary flags {#operator-binary-flags} + The relevant `manager` flags (defined in [`cmd/main.go`](https://github.com/ClickHouse/clickhouse-operator/blob/main/cmd/main.go)): | Flag | Default | Description | @@ -46,6 +48,7 @@ The `8443` (HTTPS) / `8080` (HTTP) convention in the flag's help text is only a ## Enable metrics via Helm {#enable-metrics-via-helm} + The chart already creates a `Service` for the metrics port and, optionally, a `ServiceMonitor` for prometheus-operator. 
The metrics endpoint itself is on by default (`metrics.enable: true`, port `8080`, served over HTTPS via `metrics.secure: true`). The only setting you typically need to flip is `prometheus.enable` to have the chart create a `ServiceMonitor` for you: @@ -90,6 +93,7 @@ After install the chart creates: - `ClusterRole/-metrics-reader` — non-resource URL `/metrics` with `get` verb. ## Securing the metrics endpoint {#securing-the-metrics-endpoint} + When `metrics.secure: true` the metrics server enforces TLS **and** Kubernetes authentication/authorization on every scrape. Scrapers must: 1. Present a valid Kubernetes bearer token. @@ -131,6 +135,7 @@ If you see `401 Unauthorized` or `403 Forbidden` from the metrics endpoint, the ## ServiceMonitor reference {#servicemonitor-reference} + The chart renders a ServiceMonitor of this shape when `prometheus.enable: true`: ```yaml @@ -168,9 +173,11 @@ spec: If your Prometheus instance does not run cert-manager, set `tlsConfig.insecureSkipVerify: true` and rely on bearer-token authentication only — the chart already does this when `certManager.enable: false`. ## Standalone Prometheus example {#standalone-prometheus-example} + If you do not use kube-prometheus-stack, the repository ships a self-contained example at [`examples/prometheus_secure_metrics_scraper.yaml`](https://github.com/ClickHouse/clickhouse-operator/blob/main/examples/prometheus_secure_metrics_scraper.yaml). It creates a ServiceAccount, the necessary RBAC, and a `Prometheus` CR that selects the operator's ServiceMonitor. ## Health probe endpoints {#health-probe-endpoints} + | Path | Used by | Returns | |---|---|---| | `/healthz` | Kubernetes liveness probe | `200 OK` as long as the probe server is listening. | @@ -198,9 +205,11 @@ readinessProbe: A repeatedly failing probe usually means the probe server itself never came up — for example, the manager exited early during startup. 
Check the manager logs for `unable to start manager`, RBAC failures, or `cache did not sync` errors. ## Metrics catalog {#metrics-catalog} + The operator does not register custom Prometheus collectors. Everything below is exposed by the underlying `controller-runtime` and `client-go` libraries. The most useful series, grouped by purpose: ### Reconciliation activity {#reconciliation-activity} + | Metric | Type | Labels | |---|---|---| | `controller_runtime_reconcile_total` | counter | `controller`, `result` (`success` / `error` / `requeue` / `requeue_after`) | @@ -212,6 +221,7 @@ The operator does not register custom Prometheus collectors. Everything below is The `controller` label is derived by `controller-runtime` from the resource type registered with `For(...)`. With the current code in `internal/controller/clickhouse` and `internal/controller/keeper` this resolves to `clickhousecluster` and `keepercluster` respectively. If you have customized the operator, verify with a one-time scrape of `/metrics`. ### Work queue {#work-queue} + | Metric | Type | Labels | |---|---|---| | `workqueue_depth` | gauge | `name`, `controller`, `priority` | @@ -225,11 +235,13 @@ The `controller` label is derived by `controller-runtime` from the resource type The `name` and `controller` labels carry the same value (the controller name). ### API server traffic {#api-server-traffic} + | Metric | Type | Labels | |---|---|---| | `rest_client_requests_total` | counter | `code`, `method`, `host` | ### Leader election {#leader-election} + | Metric | Type | Labels | |---|---|---| | `leader_election_master_status` | gauge | `name` (= `d4ceba06.clickhouse.com`) | @@ -237,9 +249,11 @@ The `name` and `controller` labels carry the same value (the controller name). The Helm chart enables `--leader-elect` by default, so this metric is present in standard Helm installs. When running the binary directly without the flag, the metric is absent. 
### Runtime {#runtime} + Standard Go process and runtime collectors — `go_goroutines`, `go_memstats_*`, `process_cpu_seconds_total`, `process_resident_memory_bytes`, etc. ## Useful PromQL queries {#useful-promql-queries} + ### Health overview ```promql @@ -282,6 +296,7 @@ sum(leader_election_master_status{name="d4ceba06.clickhouse.com"}) ``` ## Suggested alerts {#suggested-alerts} + Starting point for a PrometheusRule (tune thresholds for your environment): ```yaml @@ -330,6 +345,7 @@ groups: The last rule is only meaningful when leader election is enabled. ## Verifying the setup {#verifying-the-setup} + A quick end-to-end check, assuming the chart was installed in `clickhouse-operator-system`: ```bash @@ -357,5 +373,6 @@ kubectl -n $NS run curl-metrics --rm -it --restart=Never \ If the scrape returns metrics in the Prometheus exposition format, the endpoint and RBAC are correctly wired. ## Related guides {#related-guides} + - [Installation](/products/kubernetes-operator/install/helm) — Helm values relevant to monitoring. - [Configuration](/products/kubernetes-operator/guides/configuration) — TLS configuration shared with the metrics server. diff --git a/products/kubernetes-operator/guides/scaling.mdx b/products/kubernetes-operator/guides/scaling.mdx index 052a020e..f3066dae 100644 --- a/products/kubernetes-operator/guides/scaling.mdx +++ b/products/kubernetes-operator/guides/scaling.mdx @@ -1,5 +1,5 @@ --- -position: 4 +position: 5 slug: /clickhouse-operator/guides/scaling title: Scaling clusters keywords: ['kubernetes', 'scaling', 'replicas', 'shards', 'keeper', 'quorum'] @@ -16,6 +16,7 @@ A `ClickHouseCluster` always needs a Keeper, referenced through the required `sp ## Scaling replicas {#scaling-replicas} + `spec.replicas` sets the number of replicas in every shard. Each replica runs in its own StatefulSet named `-clickhouse--`, so a cluster with `shards: 2` and `replicas: 3` runs six StatefulSets. 
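The StatefulSet count follows directly from the two fields: one StatefulSet per replica in each shard. A minimal sketch of that arithmetic, where `statefulset_count` is a purely illustrative function name:

```shell
# Illustrative only: number of per-replica StatefulSets for a given topology.
# Usage: statefulset_count <shards> <replicas>
statefulset_count() {
  shards="$1"
  replicas="$2"
  # Every shard runs `replicas` StatefulSets, so the total is the product.
  echo $(( shards * replicas ))
}

# The example from the text: shards: 2 and replicas: 3.
statefulset_count 2 3
```

The same product gives the number of pods and per-replica PVCs to budget for before a scale-up.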
Raise or lower the count in place: @@ -30,6 +31,7 @@ spec: On scale up the operator creates the new per-replica StatefulSets, waits for each pod to become ready, and then synchronizes the schema to the new replicas (see [Automatic schema sync](#automatic-schema-sync)). On scale down it removes the surplus StatefulSets and cleans up the stale replicated-database replica registrations the removed replicas left behind. ## Scaling shards {#scaling-shards} + `spec.shards` sets the number of shards. Each new shard adds a full set of per-replica StatefulSets, and the operator creates one [PodDisruptionBudget per shard](/products/kubernetes-operator/guides/configuration#pod-disruption-budgets) so a disruption in one shard cannot count against another. ```yaml @@ -41,6 +43,7 @@ spec: Each shard holds a distinct slice of the data, and the operator does not copy or move rows between shards. A `Distributed` table or an explicit routing scheme decides which shard a row lands on, so adding a shard gives new writes somewhere to land without touching the rows already stored in the existing shards. ## Automatic schema sync {#automatic-schema-sync} + When `spec.settings.enableDatabaseSync` is `true` (the default), the operator keeps the schema aligned as the topology changes: - **On scale up** — once at least two replicas are ready, the operator replicates the database definitions to the newly created replicas, so a fresh replica joins with the same `Replicated` and integration databases as the rest of the cluster. @@ -51,6 +54,7 @@ This covers `Replicated` databases and integration database engines. It does not Set `enableDatabaseSync: false` to turn the behavior off, for example when an external tool owns schema propagation. The operator then reports the `SchemaSyncDisabled` reason on the `SchemaInSync` condition. 
## Conditions to watch {#scaling-conditions} + Inspect progress on the Custom Resource while a scale operation runs: ```bash @@ -72,6 +76,7 @@ kubectl get clickhousecluster sample -o yaml | sed -n '/conditions:/,/^[^ ]/p' A scale operation is complete when `ClusterSizeAligned` reports `UpToDate`, `SchemaInSync` reports `ReplicasInSync`, and `Ready` reports `AllShardsReady`. ## Scaling Keeper {#scaling-keeper} + A `KeeperCluster` runs a RAFT quorum, so the operator changes its membership **one replica at a time** and only while the cluster is in a stable state. This protects the quorum: a `2F+1` cluster tolerates `F` members down, so a 3-node cluster keeps working with one member missing and a 5-node cluster with two. ```yaml diff --git a/products/kubernetes-operator/guides/storage.mdx b/products/kubernetes-operator/guides/storage.mdx new file mode 100644 index 00000000..f70acf6c --- /dev/null +++ b/products/kubernetes-operator/guides/storage.mdx @@ -0,0 +1,175 @@ +--- +position: 3 +slug: /clickhouse-operator/guides/storage +title: Storage and volumes +keywords: ['kubernetes', 'storage', 'volumes', 'pvc', 'jbod', 'disks'] +description: 'How the operator provisions persistent storage for ClickHouse clusters, including the primary data volume, multi-disk (JBOD) layouts, expansion, and what cannot change after creation.' +doc_type: 'guide' +--- + +This guide covers how the operator provisions persistent storage for a +`ClickHouseCluster`: the primary data volume, attaching extra disks in a +multi-disk (JBOD) layout, expanding capacity, and the rules that govern what you +can and cannot change after a cluster exists. + +For the field-by-field reference, see +[Configuration → Storage configuration](/products/kubernetes-operator/guides/configuration#storage-configuration) +and the [API Reference](/products/kubernetes-operator/reference/api-reference). 
+ +## Primary data volume {#primary-data-volume} + +`spec.dataVolumeClaimSpec` is a standard Kubernetes `PersistentVolumeClaimSpec`. +The operator turns it into a StatefulSet `volumeClaimTemplate`, so the StatefulSet +controller creates and retains one PersistentVolumeClaim per replica and mounts it +at the ClickHouse data path `/var/lib/clickhouse`. + +```yaml +apiVersion: clickhouse.com/v1alpha1 +kind: ClickHouseCluster +metadata: + name: my-cluster +spec: + dataVolumeClaimSpec: + storageClassName: fast-ssd # optional; depends on the installed CSI driver + resources: + requests: + storage: 100Gi +``` + +- When `accessModes` is omitted, the operator defaults it to `ReadWriteOnce`. +- The per-replica PVC is retained when the cluster is deleted, so data survives a + delete-and-recreate of the Custom Resource. +- The same field exists on `KeeperCluster` and behaves the same way. + +## Running without a persistent data volume {#ephemeral-storage} + +`dataVolumeClaimSpec` is optional. If you omit it and do not mount your own volume +at the data path, ClickHouse writes to the container's ephemeral filesystem and the +admission webhook returns a warning that data may be lost if the cluster is restarted. + +This is intended only for throwaway or test clusters. To supply your own storage +instead of `dataVolumeClaimSpec` — for example an `emptyDir` or a pre-provisioned +volume — define it through `spec.podTemplate.volumes` and mount it at +`/var/lib/clickhouse` with `spec.containerTemplate.volumeMounts`. + + +`dataVolumeClaimSpec` and a custom volume at the data path are mutually exclusive. +If `dataVolumeClaimSpec` is set, mounting a custom volume at `/var/lib/clickhouse` +is rejected. The reserved volume names `clickhouse-storage-volume`, +`clickhouse-server-tls-volume`, and `clickhouse-server-custom-ca-volume` cannot be +used in `podTemplate.volumes`. 
+ + +## Expanding storage {#expanding-storage} + +To grow a volume, increase `resources.requests.storage` and apply the change. The +operator updates the existing PVCs in place. + +```yaml +spec: + dataVolumeClaimSpec: + resources: + requests: + storage: 200Gi # was 100Gi +``` + + +Expansion only works when the underlying StorageClass has +`allowVolumeExpansion: true`. Kubernetes does not support shrinking a PVC, so the +new size must be greater than or equal to the current size. + + +## Multi-disk (JBOD) storage {#multi-disk-jbod} + +`spec.additionalVolumeClaimTemplates` attaches extra disks to each ClickHouse +replica on top of the primary `dataVolumeClaimSpec`. Each entry is a named PVC +template — a `metadata.name` plus a PVC `spec` — reconciled exactly like the +primary data disk, so the StatefulSet controller creates and retains one PVC per +replica named `--0`. + +```yaml +spec: + dataVolumeClaimSpec: + storageClassName: fast-ssd + resources: + requests: + storage: 100Gi + additionalVolumeClaimTemplates: + - metadata: + name: disk1 + spec: + storageClassName: fast-ssd + resources: + requests: + storage: 100Gi + - metadata: + name: disk2 + spec: + storageClassName: fast-ssd + resources: + requests: + storage: 100Gi +``` + +The operator mounts each additional volume at `/var/lib/clickhouse/disks/` +and **generates the ClickHouse `storage_configuration` for you** — you do not write +it by hand. It registers every additional disk and adds it to the built-in `default` +storage policy. + +The primary data disk (`default`) and every additional disk share a single volume +of the `default` policy, so ClickHouse spreads new data parts across all of them in +round-robin fashion. Usable capacity is the sum of all disks, and every table that +does not set its own `storage_policy` — including `system.*` tables — uses the +combined set. 
+ + +The mount path keeps the template name verbatim, but the disk identifier inside +`storage_configuration` replaces hyphens with underscores. A template named +`cold-disk` is mounted at `/var/lib/clickhouse/disks/cold-disk` and appears as +`cold_disk` in the generated configuration. + + +## Custom storage policies {#custom-storage-policies} + +You do **not** need `extraConfig` for the JBOD layout above — the operator generates +the `default` policy automatically. Reach for `spec.settings.extraConfig` only when +you want storage policies *beyond* the generated default, for example a tiered +hot/cold policy with `move_factor` and `prefer_not_to_merge`, or an S3-backed disk. +Configuration you add there is merged on top of the generated `storage_configuration`. + +See the +[ClickHouse storage documentation](https://clickhouse.com/docs/engines/table-engines/mergetree-family/mergetree#table_engine-mergetree-multiple-volumes) +for the policy fields. + +## What you cannot change after creation {#immutability} + +Storage layout is largely fixed once a cluster exists. The admission webhook rejects +updates that would orphan or rebind PersistentVolumeClaims: + +- The presence of `dataVolumeClaimSpec` is immutable — you cannot **add** a data + volume to a cluster created without one, nor **remove** it from a cluster created + with one. +- The set of `additionalVolumeClaimTemplates` is fixed — you cannot **add**, + **remove**, or **rename** entries after creation. +- Expanding `resources.requests.storage` on an existing entry **is** allowed (subject + to StorageClass support, see [Expanding storage](#expanding-storage)). 
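The naming rule for additional disks described earlier in this guide (the mount path keeps the template name, while the configuration identifier swaps hyphens for underscores) can be sketched as two small helpers. The function names `disk_mount_path` and `disk_config_id` are illustrative; the code mirrors the documented behavior and is not taken from the operator source.

```shell
# Illustrative helpers mirroring the documented naming rule; not operator code.

# Mount path: the template name is used verbatim under the disks directory.
disk_mount_path() {
  printf '/var/lib/clickhouse/disks/%s\n' "$1"
}

# Disk identifier in storage_configuration: hyphens become underscores.
disk_config_id() {
  printf '%s\n' "$1" | sed 's/-/_/g'
}

# The example from the text: a template named cold-disk.
disk_mount_path cold-disk
disk_config_id cold-disk
```

Knowing the derived identifier matters when a custom storage policy in `extraConfig` refers to a generated disk by name.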
+ +## Validation reference {#validation-reference} + +| Condition | Result | +|---|---| +| No `dataVolumeClaimSpec` and no custom volume at `/var/lib/clickhouse` | Warning — possible data loss on restart | +| Custom volume mounted at `/var/lib/clickhouse` while `dataVolumeClaimSpec` is set | Rejected | +| `additionalVolumeClaimTemplates` set but `dataVolumeClaimSpec` missing | Rejected | +| Additional disk named `default` | Rejected — reserved by the ClickHouse default disk | +| Additional disk named `clickhouse-storage-volume` | Rejected — collides with the primary data volume name | +| Duplicate additional disk name | Rejected | +| Name not matching `^[a-z]([-a-z0-9]*[a-z0-9])?$` or longer than 63 characters | Rejected by the CRD schema | +| Adding or removing `dataVolumeClaimSpec` after creation | Rejected | +| Adding, removing, or renaming `additionalVolumeClaimTemplates` after creation | Rejected | +| Reserved volume name in `podTemplate.volumes` | Rejected | + +## Related guides {#related-guides} + +- [Configuration](/products/kubernetes-operator/guides/configuration) — the full field reference, including `extraConfig`. +- [Scaling clusters](/products/kubernetes-operator/guides/scaling) — how replicas and shards are added and removed. diff --git a/products/kubernetes-operator/guides/tls.mdx b/products/kubernetes-operator/guides/tls.mdx index 16579f7a..89a04ea4 100644 --- a/products/kubernetes-operator/guides/tls.mdx +++ b/products/kubernetes-operator/guides/tls.mdx @@ -1,5 +1,5 @@ --- -position: 5 +position: 6 slug: /clickhouse-operator/guides/tls title: Securing a cluster with TLS keywords: ['kubernetes', 'tls', 'ssl', 'cert-manager', 'security', 'certificates'] @@ -17,6 +17,7 @@ It is task oriented. For the field-by-field reference of `spec.settings.tls`, se and the [API Reference](/products/kubernetes-operator/reference/api-reference#clustertlsspec). 
## Prerequisites {#prerequisites} + - A running ClickHouse cluster managed by the operator (see [Introduction](/products/kubernetes-operator/guides/introduction)). - [cert-manager](https://cert-manager.io/docs/installation/) installed in the cluster. - `kubectl` access to the cluster's namespace. @@ -26,18 +27,18 @@ The operator does not generate certificates itself — it consumes a Kubernetes rotate that Secret, but any tool that writes a Secret in the expected format works. ## How the operator expects certificates {#secret-format} + TLS is enabled by pointing `spec.settings.tls.serverCertSecret` at a Secret that -contains three keys: +contains the server keypair: -| Secret key | Contents | -|------------|----------------------------------| -| `tls.crt` | PEM-encoded server certificate | -| `tls.key` | PEM-encoded private key | -| `ca.crt` | PEM-encoded CA certificate chain | +| Secret key | Contents | Required | +|------------|--------------------------------|----------| +| `tls.crt` | PEM-encoded server certificate | Yes | +| `tls.key` | PEM-encoded private key | Yes | This is exactly the layout cert-manager writes for a `Certificate` resource, so no -conversion is needed. The operator mounts these into each pod at -`/etc/clickhouse-server/tls/` and wires them into ClickHouse's `openSSL` configuration. +conversion is needed. The operator mounts the keypair into each pod at +`/etc/clickhouse-server/tls/` and wires it into ClickHouse's `openSSL` configuration. `serverCertSecret` is **mandatory** when `tls.enabled: true`. The validating @@ -46,6 +47,7 @@ unless `enabled: true`. ## Step 1 — Bootstrap a CA with cert-manager {#step-1-ca} + The most reproducible setup is a self-signed CA that then signs the server certificate. This gives you a stable `ca.crt` that clients can trust. @@ -92,6 +94,7 @@ corporate CA, Vault, ACME, etc.). Only Step 2 changes — the cluster wiring is identical. 
## Step 2 — Issue the server certificate {#step-2-cert} + Request a leaf certificate from the CA issuer. The `dnsNames` must cover how clients address the pods. The operator creates a single **headless** Service named `-clickhouse-headless`, and each replica pod is addressable at @@ -132,6 +135,7 @@ kubectl -n get secret clickhouse-cert -o jsonpath='{.data}' | jq 'ke ``` ## Step 3 — Enable TLS on the cluster {#step-3-enable} + Point the cluster at the Secret: ```yaml @@ -150,6 +154,7 @@ spec: ``` ### What the operator does {#what-the-operator-does} + When `tls.enabled: true`, the operator: - **Opens the secure ports** on every pod and the headless Service: `9440` @@ -174,6 +179,7 @@ even when TLS is off, so toggling `tls.enabled` later never collides with a ## Step 4 — Connect over TLS {#step-4-connect} + With `required: true`, clients must use the secure ports and trust the CA. Address a specific replica pod through the headless Service (or your own `ClusterIP` Service if you created one). @@ -203,6 +209,7 @@ kubectl -n get secret clickhouse-cert \ ``` ## Encrypting Keeper traffic {#keeper-tls} + Enabling TLS on the ClickHouse cluster does **not** encrypt the link to Keeper. Enable it on the `KeeperCluster` independently — issue a certificate for the Keeper service (Steps 1–2 with the Keeper service `dnsNames`) and reference it: @@ -224,12 +231,15 @@ spec: Keeper exposes its secure client port on `2281`. Once Keeper has TLS enabled, **the ClickHouse cluster connects to it over TLS automatically** — no extra setting on the -ClickHouseCluster side. ClickHouse verifies the Keeper certificate using its own -`ca.crt` bundle when it has TLS enabled, otherwise the system default CA bundle. +ClickHouseCluster side. ClickHouse verifies the Keeper certificate against the system +trust store, plus any [`caBundle`](#custom-ca) you configure. 
## Custom CA bundle {#custom-ca} -If ClickHouse must trust a CA different from the one in its server certificate (for -example, Keeper signed by a separate CA), supply a `caBundle`: + +By default ClickHouse verifies the peers it connects to (other replicas, Keeper, HTTPS +dictionary sources, S3, …) against the **system trust store**. To **additionally** trust +a private CA — a self-signed or internal CA whose root is not in the system store — +supply a `caBundle`: ```yaml spec: @@ -243,10 +253,14 @@ spec: key: ca.crt ``` -The operator mounts this bundle and points the client-side `openSSL` verification at -it instead of the certificate's own `ca.crt`. +The operator mounts this bundle and adds it to the `openSSL` client trust store +(`caConfig`). The system trust store stays in effect — your private CA is trusted **in +addition to** the public roots, so connections to public endpoints keep working. For a +self-signed setup, point `caBundle` at the `ca.crt` key of the same Secret cert-manager +wrote (as in the `cluster_with_ssl` example). ## Customizing the TLS settings {#custom-tls-settings} + The `openSSL` block the operator generates is a default, not a ceiling. It is written into the main server configuration; anything under `spec.settings.extraConfig` is rendered to `config.d/99-extra-config.yaml`, which ClickHouse merges **last** — so it overrides the @@ -273,6 +287,7 @@ for the available options, and for how `extraConfig` is merged. 
## Verify and troubleshoot {#troubleshoot} + **Confirm the secure ports are live on the headless Service:** ```bash @@ -285,17 +300,18 @@ kubectl -n get svc -clickhouse-headless \ ```bash kubectl -n exec -- ls /etc/clickhouse-server/tls/ -# ca-bundle.crt clickhouse-server.crt clickhouse-server.key +# clickhouse-server.crt clickhouse-server.key (plus custom-ca.crt when caBundle is set) ``` -| Symptom | Likely cause | -|---|---| -| Pods fail to start / volume mount error after enabling TLS | The referenced Secret is missing or lacks `tls.crt`/`tls.key`/`ca.crt`. The operator does not validate the Secret's contents — missing keys surface as a pod volume-mount failure, not a dedicated status condition. Inspect the pod with `kubectl describe pod`. | -| Webhook rejects the cluster | `required: true` set without `enabled: true`, or `enabled: true` without `serverCertSecret`. | -| Client `certificate verify failed` | Client is not trusting the CA. Pass the `ca.crt` from the Secret, or check the `dnsNames` on the certificate cover the host you connect to. | -| A plaintext client suddenly can't connect | `required: true` removed ports `9000`/`8123`. Switch the client to `9440`/`8443`, or set `required: false` to keep insecure ports open during migration. | +| Symptom | Likely cause | +|------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Pods fail to start / volume mount error after enabling TLS | The referenced Secret is missing or lacks `tls.crt`/`tls.key` (or, when `caBundle` is set, the Secret/key it references). The operator does not validate the Secret's contents — missing keys surface as a pod volume-mount failure, not a dedicated status condition. 
Inspect the pod with `kubectl describe pod`. | +| Webhook rejects the cluster | `required: true` set without `enabled: true`, or `enabled: true` without `serverCertSecret`. | +| Client `certificate verify failed` | Client is not trusting the CA. Pass the `ca.crt` from the Secret, or check the `dnsNames` on the certificate cover the host you connect to. | +| A plaintext client suddenly can't connect | `required: true` removed ports `9000`/`8123`. Switch the client to `9440`/`8443`, or set `required: false` to keep insecure ports open during migration. | ## See also {#see-also} + - [Configuration → TLS/SSL configuration](/products/kubernetes-operator/guides/configuration#tls-ssl-configuration) — field reference - [Configuration → `additionalPorts`](/products/kubernetes-operator/guides/configuration#additional-ports) — reserved ports - [API Reference → ClusterTLSSpec](/products/kubernetes-operator/reference/api-reference#clustertlsspec) diff --git a/products/kubernetes-operator/install/helm.mdx b/products/kubernetes-operator/install/helm.mdx index 06e8f2e5..eefd3425 100644 --- a/products/kubernetes-operator/install/helm.mdx +++ b/products/kubernetes-operator/install/helm.mdx @@ -10,11 +10,13 @@ sidebarTitle: 'Helm' This guide covers installing the ClickHouse Operator using Helm charts. ## Prerequisites {#prerequisites} + - Kubernetes cluster v1.28.0 or later - Helm v3.0 or later - kubectl configured to communicate with your cluster ## Install Helm {#install-helm} + If you don't have Helm installed: ```bash @@ -28,6 +30,7 @@ helm version ``` ## Install the Operator {#install-the-operator} + By default Helm chart deploys ClickHouse Operator with webhooks enabled and requires cert-manager installed. 
@@ -37,6 +40,7 @@ helm install cert-manager oci://quay.io/jetstack/charts/cert-manager -n cert-man ``` ### From OCI Helm repository {#from-oci-helm-repository} + Install the latest release ```bash helm install clickhouse-operator oci://ghcr.io/clickhouse/clickhouse-operator-helm \ @@ -53,6 +57,7 @@ Install a specific operator version ``` ### From Local Chart {#from-local-chart} + Clone the repository and install from the local chart: ```bash diff --git a/products/kubernetes-operator/install/kubectl.mdx b/products/kubernetes-operator/install/kubectl.mdx index 8373d9fa..afa3e1d3 100644 --- a/products/kubernetes-operator/install/kubectl.mdx +++ b/products/kubernetes-operator/install/kubectl.mdx @@ -10,11 +10,13 @@ sidebarTitle: 'kubectl' This guide covers installing the ClickHouse Operator using kubectl and manifest files. ## Prerequisites {#prerequisites} + - Kubernetes cluster v1.28.0 or later - kubectl v1.28.0 or later - Cluster admin permissions ## Install from Release Manifests {#install-from-release-manifests} + Requires cert-manager to issue webhook certificates. @@ -22,7 +24,13 @@ Requires cert-manager to issue webhook certificates. Install the operator and CRDs from the latest release: ```bash -kubectl apply -f https://github.com/ClickHouse/clickhouse-operator/releases/latest/download/clickhouse-operator.yaml +kubectl apply --server-side --force-conflicts -f https://github.com/ClickHouse/clickhouse-operator/releases/latest/download/clickhouse-operator.yaml +``` + +Server-side apply is required because the combined CRDs exceed the client-side apply size limit. For environments that only support client-side apply, use the description-stripped CRD variant: + +```bash +kubectl apply -f https://github.com/ClickHouse/clickhouse-operator/releases/latest/download/clickhouse-operator-stripped-crds.yaml ``` This will: @@ -35,6 +43,7 @@ This will: 7. 
Enable metrics endpoint ## Verify Installation {#verify-installation} + Check that the operator is running: ```bash @@ -60,11 +69,13 @@ keeperclusters.clickhouse.com 2025-01-06T00:00:00Z ``` ## Configure Custom Deployment Options {#configure-custom-deployment-options} + If you want to configure operator deployment options, follow the steps below. ### Clone the Repository {#clone-the-repository} + ```bash git clone https://github.com/ClickHouse/clickhouse-operator.git cd clickhouse-operator @@ -73,6 +84,7 @@ cd clickhouse-operator ### Configure installation options {#configure-installation-options} + Edit config/default/kustomization.yaml to enable/disable features as needed. * To disable webhooks, comment out the `[WEBHOOK]` and `[CERTMANAGER]` sections. @@ -83,11 +95,12 @@ Edit config/default/kustomization.yaml to enable/disable features as needed. ### Build and Deploy {#build-and-deploy} + Build the operator manifests and apply them: ```bash make build-installer VERSION= [IMG=] -kubectl apply -f dist/install.yaml +kubectl apply --server-side --force-conflicts -f dist/install.yaml ``` diff --git a/products/kubernetes-operator/install/olm.mdx b/products/kubernetes-operator/install/olm.mdx index 60e23ae0..8ff3cb14 100644 --- a/products/kubernetes-operator/install/olm.mdx +++ b/products/kubernetes-operator/install/olm.mdx @@ -10,12 +10,14 @@ sidebarTitle: 'OLM' This guide covers installing the ClickHouse Operator using Operator Lifecycle Manager (OLM). 
## Prerequisites {#prerequisites} + - Kubernetes cluster version 1.28.0 or later - kubectl configured to access your cluster - Cluster admin permissions - Installed OLM (Operator Lifecycle Manager) ## Install OLM {#install-olm} + If OLM isn't already installed in your cluster, install it: ```bash @@ -27,7 +29,9 @@ curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releas ``` ## Install the Operator {#install-the-operator} + ### Install from GitHub Catalog {#install-from-github-catalog} + ```bash # Create the operator namespace kubectl create namespace clickhouse-operator-system @@ -74,6 +78,7 @@ spec: EOF ``` ## Uninstall {#uninstall} + ```bash # Delete the subscription kubectl delete subscription clickhouse-operator -n clickhouse-operator-system @@ -90,5 +95,6 @@ kubectl delete operator clickhouse-operator.clickhouse-operator-system More info about uninstalling can be found in the [OLM documentation](https://olm.operatorframework.io/docs/tasks/uninstall-operator/). 
## Additional Resources {#additional-resources} + - [Operator Lifecycle Manager Documentation](https://olm.operatorframework.io/docs) diff --git a/products/kubernetes-operator/navigation.json b/products/kubernetes-operator/navigation.json index 24942f13..d7b3c8eb 100644 --- a/products/kubernetes-operator/navigation.json +++ b/products/kubernetes-operator/navigation.json @@ -18,6 +18,7 @@ "pages": [ "products/kubernetes-operator/guides/introduction", "products/kubernetes-operator/guides/configuration", + "products/kubernetes-operator/guides/storage", "products/kubernetes-operator/guides/monitoring", "products/kubernetes-operator/guides/scaling", "products/kubernetes-operator/guides/tls" diff --git a/products/kubernetes-operator/overview.mdx b/products/kubernetes-operator/overview.mdx index 4be620e1..b10069a1 100644 --- a/products/kubernetes-operator/overview.mdx +++ b/products/kubernetes-operator/overview.mdx @@ -14,6 +14,7 @@ It provides declarative cluster management through custom resources, enabling us The Operator handles the full lifecycle of ClickHouse clusters including scaling, upgrades, and configuration management. 
## Features {#features} + - **ClickHouse Cluster Management**: Create and manage ClickHouse clusters - **ClickHouse Keeper Integration**: Built-in support for ClickHouse Keeper clusters for distributed coordination - **Storage Provisioning**: Customizable persistent volume claims with storage class selection @@ -22,6 +23,7 @@ The Operator handles the full lifecycle of ClickHouse clusters including scaling - **Monitoring**: Prometheus metrics integration for observability ## Installation {#installation} + Choose your preferred installation method: - [Manifests Installation](/products/kubernetes-operator/install/kubectl) - Install using kubectl/kustomize @@ -29,6 +31,7 @@ Choose your preferred installation method: - [Operator Lifecycle Manager (OLM) Installation](/products/kubernetes-operator/install/olm) - Install using OLM ## Guides {#guides} + - **[Introduction](/products/kubernetes-operator/guides/introduction)** - General overview of ClickHouse Operator concepts - **[Configuration Guide](/products/kubernetes-operator/guides/configuration)** - Configure ClickHouse and Keeper clusters - **[Monitoring](/products/kubernetes-operator/guides/monitoring)** - Prometheus metrics and health probes for the operator @@ -36,4 +39,5 @@ Choose your preferred installation method: - **[Securing with TLS](/products/kubernetes-operator/guides/tls)** - Encrypt clusters end-to-end with cert-manager ## Reference {#reference} + - **[API Reference](/products/kubernetes-operator/reference/api-reference)** - Complete API documentation for custom resources diff --git a/products/kubernetes-operator/reference/api-reference.mdx b/products/kubernetes-operator/reference/api-reference.mdx index 3ce7ff37..11cb49a0 100644 --- a/products/kubernetes-operator/reference/api-reference.mdx +++ b/products/kubernetes-operator/reference/api-reference.mdx @@ -11,6 +11,7 @@ sidebarTitle: 'API reference' This document provides detailed API reference for the ClickHouse Operator custom resources. 
## AdditionalPort {#additionalport} + AdditionalPort declares one extra TCP port to expose on the ClickHouse Pod and the operator-managed headless Service. | Field | Type | Description | Required | Default | @@ -21,10 +22,24 @@ AdditionalPort declares one extra TCP port to expose on the ClickHouse Pod and t Appears in: - [ClickHouseClusterSpec](#clickhouseclusterspec) +## CABundleSelector {#cabundleselector} + +CABundleSelector selects a key holding a CA bundle from a Secret in the cluster's namespace. + +| Field | Type | Description | Required | Default | +|-------|------|-------------|----------|---------| +| `name` | string | The name of the secret in the cluster's namespace to select from. | true | | +| `key` | string | The key of the secret to select from. Must be a valid secret key. | false | ca.crt | + +Appears in: +- [ClusterTLSSpec](#clustertlsspec) + ## ClickHouseCluster {#clickhousecluster} + ClickHouseCluster is the Schema for the `clickhouseclusters` API. ### API Version and Kind {#clickhousecluster-api-version-and-kind} + ```yaml apiVersion: clickhouse.com/v1alpha1 kind: ClickHouseCluster @@ -39,9 +54,11 @@ Appears in: - [ClickHouseClusterList](#clickhouseclusterlist) ## ClickHouseClusterList {#clickhouseclusterlist} + ClickHouseClusterList contains a list of ClickHouseCluster. ### API Version and Kind {#clickhouseclusterlist-api-version-and-kind} + ```yaml apiVersion: clickhouse.com/v1alpha1 kind: ClickHouseClusterList @@ -52,6 +69,7 @@ kind: ClickHouseClusterList | `items` | [ClickHouseCluster](#clickhousecluster) array | | true | | ## ClickHouseClusterSpec {#clickhouseclusterspec} + ClickHouseClusterSpec defines the desired state of ClickHouseCluster. | Field | Type | Description | Required | Default | @@ -62,6 +80,7 @@ ClickHouseClusterSpec defines the desired state of ClickHouseCluster. | `podTemplate` | [PodTemplateSpec](#podtemplatespec) | Parameters passed to the ClickHouse pod spec. 
| false | | | `containerTemplate` | [ContainerTemplateSpec](#containertemplatespec) | Parameters passed to the ClickHouse container spec. | false | | | `dataVolumeClaimSpec` | [PersistentVolumeClaimSpec](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#persistentvolumeclaimspec-v1-core) | Specification of persistent storage for ClickHouse data. | false | | +| `additionalVolumeClaimTemplates` | [PersistentVolumeClaimTemplate](#persistentvolumeclaimtemplate) array | Additional per-pod PVC templates for JBOD / multi-disk storage.
Each entry is propagated to the StatefulSet volumeClaimTemplates, mounted at /var/lib/clickhouse/disks/ and
added to the generated JBOD storage policy.
The set of disks is fixed at creation. | false | | | `labels` | object (keys:string, values:string) | Additional labels that are added to resources. | false | | | `annotations` | object (keys:string, values:string) | Additional annotations that are added to resources. | false | | | `podDisruptionBudget` | [PodDisruptionBudgetSpec](#poddisruptionbudgetspec) | PodDisruptionBudget configures the PDB created for each shard.
When unset, the operator defaults to maxUnavailable=1 for single-replica
shards and minAvailable=1 for multi-replica shards. | false | | @@ -76,6 +95,7 @@ Appears in: - [ClickHouseCluster](#clickhousecluster) ## ClickHouseClusterStatus {#clickhouseclusterstatus} + ClickHouseClusterStatus defines the observed state of ClickHouseCluster. | Field | Type | Description | Required | Default | @@ -94,6 +114,7 @@ Appears in: - [ClickHouseCluster](#clickhousecluster) ## ClickHouseSettings {#clickhousesettings} + ClickHouseSettings defines ClickHouse server settings options. | Field | Type | Description | Required | Default | @@ -109,20 +130,22 @@ Appears in: - [ClickHouseClusterSpec](#clickhouseclusterspec) ## ClusterTLSSpec {#clustertlsspec} + ClusterTLSSpec defines cluster TLS configuration. | Field | Type | Description | Required | Default | |-------|------|-------------|----------|---------| | `enabled` | boolean | Enabled indicates whether TLS is enabled, determining if secure ports should be opened. | false | false | | `required` | boolean | Required specifies whether TLS must be enforced for all connections. Disables not secure ports. | false | false | -| `serverCertSecret` | [LocalObjectReference](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#localobjectreference-v1-core) | ServerCertSecretRef is a reference to a TLS Secret containing the server certificate.
It is expected that the Secret has the same structure as certificates generated by cert-manager,
with the certificate and private key stored under "tls.crt" and "tls.key" keys respectively. | false | | -| `caBundle` | [SecretKeySelector](#secretkeyselector) | CABundle is a reference to a TLS Secret containing the CA bundle.
If empty and ServerCertSecret is specified, the CA bundle from certificate will be used.
Otherwise, system trusted CA bundle will be used.
Key is defaulted to "ca.crt" if not specified. | false | | +| `serverCertSecret` | [LocalObjectReference](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#localobjectreference-v1-core) | ServerCertSecret is a reference to a TLS Secret containing the server certificate.
It is expected that the Secret has the same structure as certificates generated by cert-manager,
with the certificate and private key stored under "tls.crt" and "tls.key" keys respectively. | false | | +| `caBundle` | [CABundleSelector](#cabundleselector) | CABundle is a reference to a Secret key holding a CA bundle used to verify peer certificates.
If empty, the system trusted CA bundle is used.
Key is defaulted to "ca.crt" if not specified. | false | | Appears in: - [ClickHouseSettings](#clickhousesettings) - [KeeperSettings](#keepersettings) ## ConfigMapKeySelector {#configmapkeyselector} + ConfigMapKeySelector selects a key of a ConfigMap. | Field | Type | Description | Required | Default | @@ -134,6 +157,7 @@ Appears in: - [DefaultPasswordSelector](#defaultpasswordselector) ## ContainerImage {#containerimage} + ContainerImage defines a container image with repository, tag or hash. | Field | Type | Description | Required | Default | @@ -146,6 +170,7 @@ Appears in: - [ContainerTemplateSpec](#containertemplatespec) ## ContainerTemplateSpec {#containertemplatespec} + ContainerTemplateSpec describes the container configuration overrides for the cluster's containers. | Field | Type | Description | Required | Default | @@ -164,6 +189,7 @@ Appears in: - [KeeperClusterSpec](#keeperclusterspec) ## DefaultPasswordSelector {#defaultpasswordselector} + DefaultPasswordSelector selects the source for the default user's password. | Field | Type | Description | Required | Default | @@ -176,6 +202,7 @@ Appears in: - [ClickHouseSettings](#clickhousesettings) ## ExternalSecret {#externalsecret} + ExternalSecret is a reference to a Secret in the same namespace. | Field | Type | Description | Required | Default | @@ -187,6 +214,7 @@ Appears in: - [ClickHouseClusterSpec](#clickhouseclusterspec) ## ExternalSecretPolicy {#externalsecretpolicy} + ExternalSecretPolicy controls how the operator treats the external secret's content. | Field | Description | @@ -198,9 +226,11 @@ Appears in: - [ExternalSecret](#externalsecret) ## KeeperCluster {#keepercluster} + KeeperCluster is the Schema for the `keeperclusters` API. 
### API Version and Kind {#keepercluster-api-version-and-kind} + ```yaml apiVersion: clickhouse.com/v1alpha1 kind: KeeperCluster @@ -215,9 +245,11 @@ Appears in: - [KeeperClusterList](#keeperclusterlist) ## KeeperClusterList {#keeperclusterlist} + KeeperClusterList contains a list of KeeperCluster. ### API Version and Kind {#keeperclusterlist-api-version-and-kind} + ```yaml apiVersion: clickhouse.com/v1alpha1 kind: KeeperClusterList @@ -228,6 +260,7 @@ kind: KeeperClusterList | `items` | [KeeperCluster](#keepercluster) array | | true | | ## KeeperClusterReference {#keeperclusterreference} + KeeperClusterReference identifies the KeeperCluster used by a ClickHouseCluster. | Field | Type | Description | Required | Default | @@ -239,6 +272,7 @@ Appears in: - [ClickHouseClusterSpec](#clickhouseclusterspec) ## KeeperClusterSpec {#keeperclusterspec} + KeeperClusterSpec defines the desired state of KeeperCluster. | Field | Type | Description | Required | Default | @@ -253,12 +287,13 @@ KeeperClusterSpec defines the desired state of KeeperCluster. | `settings` | [KeeperSettings](#keepersettings) | Configuration parameters for ClickHouse Keeper server. | false | | | `clusterDomain` | string | ClusterDomain is the Kubernetes cluster domain suffix used for DNS resolution. | false | cluster.local | | `upgradeChannel` | string | UpgradeChannel specifies the release channel for major version upgrade checks.
When empty, only minor updates are proposed. Allowed values are: stable, lts, or a specific major.minor version (e.g. 25.8). | false | | -| `versionProbeTemplate` | [VersionProbeTemplate](#versionprobetemplate) | VersionProbeTemplate overrides for the version detection Job. | false | | +| `versionProbeTemplate` | [VersionProbeTemplate](#versionprobetemplate) | VersionProbeTemplate overrides for the version detection Job.
Deprecated: Keeper version probe Jobs are not used; this field is retained for backward compatibility. | false | | Appears in: - [KeeperCluster](#keepercluster) ## KeeperClusterStatus {#keeperclusterstatus} + KeeperClusterStatus defines the observed state of KeeperCluster. | Field | Type | Description | Required | Default | @@ -270,13 +305,14 @@ KeeperClusterStatus defines the observed state of KeeperCluster. | `currentRevision` | string | CurrentRevision indicates latest applied KeeperCluster spec revision. | true | | | `updateRevision` | string | UpdateRevision indicates latest requested KeeperCluster spec revision. | true | | | `observedGeneration` | integer | ObservedGeneration indicates latest generation observed by controller. | true | | -| `version` | string | Version indicates the version reported by the container image. | false | | -| `versionProbeRevision` | string | VersionProbeRevision is the image hash of the last successful version probe.
When this matches the current image hash, the cached Version is used directly. | false | | +| `version` | string | Version indicates the version reported by the Keeper server. | false | | +| `versionProbeRevision` | string | VersionProbeRevision is the image hash of the last successful version probe.
Deprecated: Keeper version probe Jobs are not used; this field is retained for backward compatibility. | false | | Appears in: - [KeeperCluster](#keepercluster) ## KeeperSettings {#keepersettings} + KeeperSettings defines ClickHouse Keeper server configuration. | Field | Type | Description | Required | Default | @@ -289,6 +325,7 @@ Appears in: - [KeeperClusterSpec](#keeperclusterspec) ## LoggerConfig {#loggerconfig} + LoggerConfig defines server logging configuration. | Field | Type | Description | Required | Default | @@ -303,7 +340,21 @@ Appears in: - [ClickHouseSettings](#clickhousesettings) - [KeeperSettings](#keepersettings) +## NamedTemplateMeta {#namedtemplatemeta} + +NamedTemplateMeta defines supported metadata settings for template objects that require a name. + +| Field | Type | Description | Required | Default | +|-------|------|-------------|----------|---------| +| `name` | string | Name is the resource identifier. | true | | +| `labels` | object (keys:string, values:string) | Labels are labels applied to the template objects. | false | | +| `annotations` | object (keys:string, values:string) | Annotations are annotations applied to the template objects. | false | | + +Appears in: +- [PersistentVolumeClaimTemplate](#persistentvolumeclaimtemplate) + ## PDBPolicy {#pdbpolicy} + PDBPolicy controls whether PodDisruptionBudgets are created. | Field | Description | @@ -315,7 +366,20 @@ PDBPolicy controls whether PodDisruptionBudgets are created. Appears in: - [PodDisruptionBudgetSpec](#poddisruptionbudgetspec) +## PersistentVolumeClaimTemplate {#persistentvolumeclaimtemplate} + +PersistentVolumeClaimTemplate is a named template for a per-replica PersistentVolumeClaim. + +| Field | Type | Description | Required | Default | +|-------|------|-------------|----------|---------| +| `metadata` | [NamedTemplateMeta](#namedtemplatemeta) | Refer to Kubernetes API documentation for fields of `metadata`. 
| true | | +| `spec` | [PersistentVolumeClaimSpec](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#persistentvolumeclaimspec-v1-core) | Spec defines the desired characteristics of a volume requested by a pod author.
More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#persistentvolumeclaims | true | | + +Appears in: +- [ClickHouseClusterSpec](#clickhouseclusterspec) + ## PodDisruptionBudgetSpec {#poddisruptionbudgetspec} + PodDisruptionBudgetSpec configures the PDB created for the cluster. Exactly one of MinAvailable or MaxUnavailable may be set. When neither is set, the operator picks a safe default based on replica count. @@ -332,6 +396,7 @@ Appears in: - [KeeperClusterSpec](#keeperclusterspec) ## PodTemplateSpec {#podtemplatespec} + PodTemplateSpec describes the pod configuration overrides for the cluster's pods. | Field | Type | Description | Required | Default | @@ -357,6 +422,7 @@ Appears in: - [KeeperClusterSpec](#keeperclusterspec) ## SecretKeySelector {#secretkeyselector} + SecretKeySelector selects a key of a Secret. | Field | Type | Description | Required | Default | @@ -365,10 +431,10 @@ SecretKeySelector selects a key of a Secret. | `key` | string | The key of the secret to select from. Must be a valid secret key. | true | | Appears in: -- [ClusterTLSSpec](#clustertlsspec) - [DefaultPasswordSelector](#defaultpasswordselector) ## TemplateMeta {#templatemeta} + TemplateMeta defines supported metadata settings for template objects. | Field | Type | Description | Required | Default | @@ -381,6 +447,7 @@ Appears in: - [VersionProbeTemplate](#versionprobetemplate) ## VersionProbeContainer {#versionprobecontainer} + VersionProbeContainer defines container-level overrides for the version probe. Field names and JSON tags match corev1.Container so that SMP merges by name. @@ -394,6 +461,7 @@ Appears in: - [VersionProbePodSpec](#versionprobepodspec) ## VersionProbeJobSpec {#versionprobejobspec} + VersionProbeJobSpec defines Job-level overrides for the version probe. 
| Field | Type | Description | Required | Default | @@ -405,6 +473,7 @@ Appears in: - [VersionProbeTemplate](#versionprobetemplate) ## VersionProbePodSpec {#versionprobepodspec} + VersionProbePodSpec defines Pod-level overrides for the version probe. Field names and JSON tags match corev1.PodSpec for strategic merge patch compatibility. @@ -419,6 +488,7 @@ Appears in: - [VersionProbePodTemplate](#versionprobepodtemplate) ## VersionProbePodTemplate {#versionprobepodtemplate} + VersionProbePodTemplate describes overrides for the version probe Pod. | Field | Type | Description | Required | Default | @@ -430,6 +500,7 @@ Appears in: - [VersionProbeJobSpec](#versionprobejobspec) ## VersionProbeTemplate {#versionprobetemplate} + VersionProbeTemplate defines overrides for the version detection Job. The structure mirrors batchv1.JobTemplateSpec, exposing only supported fields.