
Add ClickHouse destination#658

Merged
jmqd merged 100 commits into main from jm/clickhouse
May 4, 2026

Conversation

@jmqd
Contributor

@jmqd jmqd commented Apr 7, 2026

Adds ClickHouse as a destination to etl.

Replicates Postgres tables into append-only MergeTree tables, with cdc_operation and cdc_lsn columns appended to every row. Both initial copy and streaming are supported, encoded directly as RowBinary (rationale in encoding.rs). Per-INSERT byte budgeting keeps peak memory bounded for large copies.
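The per-INSERT byte budgeting can be sketched roughly as below. This is an illustrative stand-in, not the PR's actual API: `InsertBatcher`, the budget constant, and the flush counter are assumed names, and the real destination would send the buffered RowBinary body to ClickHouse on each flush.

```rust
// Hedged sketch of per-INSERT byte budgeting during the initial copy:
// accumulate encoded rows and start a new INSERT whenever the buffer
// would exceed a byte budget, so peak memory stays bounded.

const INSERT_BYTE_BUDGET: usize = 8 * 1024 * 1024; // assumed 8 MiB budget

struct InsertBatcher {
    buf: Vec<u8>,
    flushed: usize, // number of INSERTs issued so far
}

impl InsertBatcher {
    fn new() -> Self {
        Self { buf: Vec::new(), flushed: 0 }
    }

    fn push_row(&mut self, encoded_row: &[u8]) {
        if !self.buf.is_empty() && self.buf.len() + encoded_row.len() > INSERT_BYTE_BUDGET {
            self.flush();
        }
        self.buf.extend_from_slice(encoded_row);
    }

    fn flush(&mut self) {
        if !self.buf.is_empty() {
            // The real destination would POST self.buf as one RowBinary
            // INSERT here; this sketch just counts flushes.
            self.flushed += 1;
            self.buf.clear();
        }
    }
}

fn main() {
    let mut b = InsertBatcher::new();
    let row = vec![0u8; 3 * 1024 * 1024]; // 3 MiB encoded row
    for _ in 0..4 {
        b.push_row(&row);
    }
    b.flush();
    // Two rows fit per 8 MiB INSERT, so four rows yield two flushes.
    assert_eq!(b.flushed, 2);
}
```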

Schema evolution / DDL is handled by diffing against the previous snapshot and replaying it through idempotent DDL statements, so an interrupted run is recoverable on restart. RowBinary nullable flags are derived from the actual ClickHouse schema rather than the source, since ClickHouse forces newly-added columns to Nullable(T) regardless of the upstream NOT NULL constraint. Identifiers are quoted at every SQL boundary, and a validate_connectivity probe lets the etl-api ping/check for health similarly to Iceberg.
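Quoting identifiers at every SQL boundary might look like the following conservative sketch. Rather than reproduce ClickHouse's exact escaping rules, this stand-in wraps names in backticks and rejects identifiers containing quoting metacharacters; the PR's real implementation may escape instead, and the function name is assumed.

```rust
// Hedged sketch: quote an identifier for interpolation into ClickHouse SQL.
// Conservative by design: names containing a backtick or backslash are
// rejected outright instead of escaped.

fn quote_identifier(name: &str) -> Result<String, String> {
    if name.contains('`') || name.contains('\\') {
        return Err(format!(
            "identifier {name:?} contains quoting metacharacters"
        ));
    }
    Ok(format!("`{name}`"))
}

fn main() {
    // Spaces and mixed case are safe once quoted.
    assert_eq!(quote_identifier("user events").unwrap(), "`user events`");
    // Suspicious names fail loudly rather than producing malformed SQL.
    assert!(quote_identifier("bad`name").is_err());
}
```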

Coverage: 20 unit tests on the encoder/schema/client helpers plus 15 integration tests covering initial copy of all supported types, insert/update/delete streaming (with key-only DELETE tombstone expansion), restart, intermediate flushing, multi-table concurrency, truncate, and schema-change scenarios.

@jmqd jmqd changed the title Jm/clickhouse ClickHouse destination Apr 7, 2026
bnjjj and others added 10 commits April 7, 2026 17:08
Signed-off-by: Benjamin <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin <5719034+bnjjj@users.noreply.github.com>
Signed-off-by: Benjamin <5719034+bnjjj@users.noreply.github.com>
jmqd added 3 commits April 7, 2026 17:52
Main introduced AsyncResult parameters on the Destination trait
(truncate_table, write_table_rows, write_events). Update the
ClickHouse implementation to conform, add missing K8sClient trait
stubs, and fix imports that drifted during rebase.
These were carried over during the ClickHouse rebase but the startup
path on main no longer calls them, triggering dead_code warnings.
@coveralls

coveralls commented Apr 7, 2026

Coverage Status

coverage: 76.444% (+0.3%) from 76.129% — jm/clickhouse into main

Silently skipping tests when env vars are missing gives false
confidence. The tests should fail explicitly so missing configuration
is always noticed.
jmqd added 4 commits April 9, 2026 15:49
Exercises encoding edge cases the all_types test does not cover:
- NULL scalars (nullable columns with SQL NULL)
- NULL elements inside arrays (ARRAY[1, NULL, 3])
- Empty strings distinct from NULLs
- Single-element arrays (varint length = 0x01)
- Multi-byte UTF-8 (emoji, CJK) in scalars and arrays
- Zero integer distinct from NULL
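The NULL-vs-empty-string and varint-length cases above follow from how RowBinary encodes values: strings are length-prefixed with an unsigned LEB128 varint, and Nullable(T) values carry a leading flag byte (0x01 for NULL, 0x00 for present). The sketch below is an illustrative stand-in, not the PR's encoding.rs.

```rust
// Hedged sketch of two RowBinary building blocks exercised by these tests:
// LEB128 varint length prefixes and the Nullable(T) flag byte.

fn write_varint(mut n: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (n & 0x7f) as u8;
        n >>= 7;
        if n == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80); // continuation bit: more bytes follow
    }
}

fn write_nullable_string(value: Option<&str>, out: &mut Vec<u8>) {
    match value {
        None => out.push(0x01), // NULL marker, no payload
        Some(s) => {
            out.push(0x00); // non-NULL marker
            write_varint(s.len() as u64, out);
            out.extend_from_slice(s.as_bytes());
        }
    }
}

fn main() {
    let mut buf = Vec::new();
    write_nullable_string(Some("hi"), &mut buf);
    assert_eq!(buf, [0x00, 0x02, b'h', b'i']); // single-byte varint length
    buf.clear();
    write_nullable_string(None, &mut buf);
    assert_eq!(buf, [0x01]); // NULL is one flag byte
    buf.clear();
    write_nullable_string(Some(""), &mut buf);
    assert_eq!(buf, [0x00, 0x00]); // empty string is distinct from NULL
}
```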
The docker-compose stack already starts a ClickHouse container but
the test env vars were not set, causing the integration tests to
panic. Wire up the same defaults used by docker-compose.
delete_dynamic_replicator_secrets deleted BigQuery, Iceberg, and
Ducklake secrets but not ClickHouse, leaving orphaned K8s secrets
when ClickHouse pipelines were deleted or changed destination type.
Previously this silently wrote a zero-length string, corrupting data
without any signal. Now returns a ConversionError so the problem is
surfaced to the operator.
The previous ErrorKind::Unknown discarded the JoinError, losing the
panic message. Now preserves the error as a detail string and uses a
more specific error kind matching the DuckLake convention.
@jmqd jmqd changed the title ClickHouse destination Add ClickHouse destination Apr 10, 2026
jmqd added 4 commits April 13, 2026 15:07
Uses REPLICA IDENTITY FULL so Postgres sends all column values in the
DELETE event. Inserts two rows, deletes one, and verifies the correct
row appears with cdc_operation = "DELETE", old data preserved, and a
positive LSN — while the other row remains untouched.
# Conflicts:
#	etl-api/Cargo.toml
#	etl-destinations/Cargo.toml
#	etl-examples/Cargo.toml
#	etl-examples/README.md
Contributor

@farazdagi farazdagi left a comment


Really great work!

Comment thread etl-api/src/configs/destination.rs Outdated
Comment thread etl-api/src/configs/destination.rs Outdated
Comment thread etl-api/src/k8s/core.rs
Comment thread etl-api/src/validation/validators.rs
Comment thread etl-examples/src/bin/clickhouse.rs Outdated
Comment thread etl-destinations/src/clickhouse/core.rs Outdated
Comment thread etl-destinations/src/clickhouse/core.rs
Comment thread etl-destinations/src/clickhouse/core.rs Outdated
Contributor

@iambriccardo iambriccardo left a comment


LGTM!

jmqd and others added 7 commits May 1, 2026 12:50
Co-authored-by: Victor Farazdagi <simple.square@gmail.com>
The merge from main brought `etl-replicator/src/init/destination.rs`,
whose `destination_name` match did not include the `ClickHouse` variant
that exists on this branch. Build broke with E0004 non-exhaustive
patterns. Add the missing arm.
@jmqd jmqd force-pushed the jm/clickhouse branch from 5e22ac0 to 81fbd10 Compare May 1, 2026 04:58
Contributor

@farazdagi farazdagi left a comment


Really good work!

Comment thread etl-api/src/k8s/http.rs Outdated
Comment thread etl-destinations/src/clickhouse/client.rs
Comment thread etl-destinations/src/clickhouse/client.rs Outdated
Comment thread etl-destinations/src/clickhouse/client.rs
Comment thread etl-destinations/src/clickhouse/core.rs Outdated
Comment thread etl-destinations/src/clickhouse/core.rs Outdated
Comment thread etl-destinations/src/clickhouse/core.rs
Comment thread etl-replicator/configuration/base.yaml Outdated
Comment thread etl-replicator/configuration/dev.yaml Outdated
Comment thread etl-replicator/configuration/prod.yaml Outdated
jmqd added 7 commits May 1, 2026 18:15
The `clickhouse_pipeline::*` integration tests call
`spawn_source_database()` and so share the source Postgres cluster, but
they were missing from the `shared-pg` filter in `.config/nextest.toml`
and `xtask/src/commands/nextest.rs`. Under `cargo xtask nextest run`
they were therefore landing in the non-pg lane and running in parallel
against whichever cluster `TESTS_DATABASE_PORT` happens to point at,
defeating the `max-threads = 1` serialization the test group is
supposed to provide and risking timing-sensitive contention under
load. Add `clickhouse_pipeline` to both filter copies.
ClickHouse `Date` is a UInt16 day offset from 1970-01-01 and only
covers 1970-01-01..=2149-06-06. Postgres `date` has a much wider range,
and the previous `date_to_days(...).clamp(...)` silently turned
pre-1970 dates into 1970-01-01 and far-future dates into the ClickHouse
maximum, corrupting historical and edge-case data without surfacing
anything to the operator.

Switch the schema mapping for `Type::DATE` and `Type::DATE_ARRAY` to
ClickHouse `Date32`, swap `ClickHouseValue::Date(u16)` for
`ClickHouseValue::Date32(i32)`, and convert dates through a fallible
`date_to_date32_days` that returns `ConversionError` when the value
falls outside `Date32`'s `1900-01-01..=2299-12-31` range. Failing the
batch is strictly better than silent corruption.

`cell_to_clickhouse_value` and `array_cell_to_clickhouse_values` now
return `EtlResult` so the conversion error can propagate; both
RowBinary call sites in `core.rs` thread the error up via `?`.
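The fallible conversion described above can be sketched as below. The day-offset constants follow from the Date32 range (1900-01-01 is 25,567 days before the Unix epoch; 2299-12-31 is 120,529 days after), while the error type and function signature are illustrative stand-ins for the crate's `ConversionError` and `date_to_date32_days`.

```rust
// Hedged sketch: validate a Postgres date (given as days from the Unix
// epoch) against ClickHouse Date32's supported range before encoding,
// returning an error instead of silently clamping out-of-range dates.

const DATE32_MIN_DAYS_FROM_UNIX_EPOCH: i32 = -25_567; // 1900-01-01
const DATE32_MAX_DAYS_FROM_UNIX_EPOCH: i32 = 120_529; // 2299-12-31

#[derive(Debug, PartialEq)]
struct ConversionError(String);

fn date_to_date32_days(days_from_unix_epoch: i64) -> Result<i32, ConversionError> {
    let range = i64::from(DATE32_MIN_DAYS_FROM_UNIX_EPOCH)
        ..=i64::from(DATE32_MAX_DAYS_FROM_UNIX_EPOCH);
    if range.contains(&days_from_unix_epoch) {
        Ok(days_from_unix_epoch as i32)
    } else {
        Err(ConversionError(format!(
            "date {days_from_unix_epoch} days from epoch is outside Date32 range"
        )))
    }
}

fn main() {
    assert_eq!(date_to_date32_days(-1), Ok(-1)); // 1969-12-31, now representable
    assert_eq!(date_to_date32_days(0), Ok(0)); // 1970-01-01
    assert!(date_to_date32_days(-30_000).is_err()); // before 1900-01-01: fail, don't clamp
}
```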
Add `pre_1970_and_far_future_dates_round_trip`, an integration test
that runs the pipeline against a Postgres `date not null` column
populated with the Date32 boundaries (1900-01-01, 1969-12-31,
1970-01-01, 2024-01-15, 2299-12-31) and asserts each value lands in
ClickHouse as the expected signed day offset. This pins the new Date32
behavior so the previous silent-clamp regression cannot return.

Update `AllTypesRow.date_col` from `u16` to `i32` to match the
`Date32` schema the destination now produces.
Address review on the date-boundaries test:

- Rename `DATE_1900_01_01_DAYS` -> `DATE32_MIN_DAYS_FROM_UNIX_EPOCH`
  and `DATE_2299_12_31_DAYS` -> `DATE32_MAX_DAYS_FROM_UNIX_EPOCH` so the
  constant names describe their semantic role (the Date32 boundaries)
  rather than the underlying date.
- Drop `DATE_1969_12_31_DAYS = -1`; the literal `-1` reads more clearly
  inline with a description in the assertion message.
@jmqd
Contributor Author

jmqd commented May 4, 2026

Great work, left a few comments.

  • The new clickhouse_pipeline tests call spawn_source_database() but are not included in the shared-pg nextest filter. Under cargo xtask nextest run, they run in the non-pg lane instead of the sharded/serialized Postgres lanes, contrary to the test-group comment and likely to make CI scheduling flaky under load.

Good catch, fixed in cd486b8

  • Could we avoid clamping Postgres date values when encoding to ClickHouse? ClickHouse Date is a UInt16 day offset from 1970-01-01 and only supports roughly 1970-01-01..=2149-06-06, while Postgres date has a much wider range. The current date_to_days(...).clamp(...) would silently turn pre-1970 dates into 1970-01-01 and far-future dates into ClickHouse’s max date. I think we should either map Postgres date to Date32 and still error outside the Date32 range, or keep Date but return a conversion error when the value is not representable. Silent corruption feels worse than failing the batch here.

Yeah, this also makes sense. Fixed in 6e54249 and 1fe9f2b

jmqd added 5 commits May 4, 2026 13:05
`etl-replicator/configuration/{base,dev,prod}.yaml` are not on main and
only ever held placeholder credentials (`password: password`) plus
Kubernetes service URLs. The deployment model is explicitly to provide
configuration at runtime via `APP_CONFIG_DIR` (see Dockerfile and
README), so committed dummy configs serve no purpose and risk
normalising checked-in placeholder secrets.
Missed in the Date -> Date32 migration. CI surfaced it as
`SchemaMismatch: attempting to (de)serialize ClickHouse type Date32 as
u16` when `delete_with_default_replica_identity` ran. Match the row
struct to the new schema.
@jmqd jmqd enabled auto-merge (squash) May 4, 2026 07:38
@jmqd jmqd merged commit 02a7072 into main May 4, 2026
14 checks passed
@jmqd jmqd deleted the jm/clickhouse branch May 4, 2026 07:43


5 participants