Skip to content

26.3 Antalya port - Alternative syntax for cluster functions#1640

Merged
zvonand merged 35 commits into
antalya-26.3from
frontport/antalya-26.3/alternative_syntax
May 7, 2026
Merged

26.3 Antalya port - Alternative syntax for cluster functions#1640
zvonand merged 35 commits into
antalya-26.3from
frontport/antalya-26.3/alternative_syntax

Conversation

@ianton-ru
Copy link
Copy Markdown

@ianton-ru ianton-ru commented Apr 13, 2026

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Frontports for Antalya 26.3

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

zvonand and others added 2 commits April 10, 2026 16:50
…tive_syntax

26.1 Antalya port - Alternative syntax for cluster functions
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 13, 2026

Workflow [PR], commit [48ff375]

zvonand and others added 3 commits April 13, 2026 15:15
…e_for_partitioning

26.1 Antalya port - Timezone for partitioning
…nquoted_timezone

Fix unquoted timezone in sorting key
@ianton-ru ianton-ru changed the title 26.1 Antalya port - Alternative syntax for cluster functions 26.3 Antalya port - Alternative syntax for cluster functions Apr 13, 2026
@ianton-ru
Copy link
Copy Markdown
Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 88842054aa

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +721 to +722
pure_storage->renameInMemory(new_table_id);
return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Update wrapper storage ID during renameInMemory

StorageObjectStorageCluster::renameInMemory returns after renaming only pure_storage, but pure_storage is always present, so the wrapper's own storage_id is never updated. After RENAME TABLE, calls that read StorageObjectStorageCluster::getStorageID() can still see the old identifier, which can break catalog/dependency logic that relies on the storage object reporting its current name.

Useful? React with 👍 / 👎.

Comment thread src/Storages/IStorageCluster.cpp Outdated
Comment on lines +247 to +248
static pcg64 rng(randomSeed());
size_t shard_num = rng() % host_addresses.size();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use thread-safe RNG when choosing remote initiator host

The function keeps a process-wide static pcg64 and advances it on each call. When multiple queries use object_storage_remote_initiator concurrently, they race on this shared RNG state, which is undefined behavior in C++. This can lead to flaky host selection and hard-to-diagnose concurrency issues; use thread_local RNG or guard access with synchronization.

Useful? React with 👍 / 👎.

@ianton-ru
Copy link
Copy Markdown
Author

Test test_s3_cluster/test.py::test_object_storage_remote_initiator is depended on #1646

/// controls new feature and it's 'true' by default, use 'false' as previous_value).
/// It's used to implement `compatibility` setting (see https://github.com/ClickHouse/ClickHouse/issues/35972)
/// Note: please check if the key already exists to prevent duplicate entries.
addSettingsChanges(settings_changes_history, "26.3.1.20001.altinityantalya",
Copy link
Copy Markdown
Collaborator

@zvonand zvonand Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am making an update for our settingschangeshistory.

I will later ask you to move all your settings to a corresponding section: whether they appeared in 25.8, in 26.1 or in any other version. It is not correct to put them all into the latest release block, we won't do this anymore

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will have commented lines for all settings that existed in 26.1. In future, you will add new settings, but for forward-ports you will uncomment corresponding lines

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merged

@ianton-ru ianton-ru added the port-antalya PRs to be ported to all new Antalya releases label Apr 19, 2026
@zvonand
Copy link
Copy Markdown
Collaborator

zvonand commented May 1, 2026

test_s3_cluster are failing -- looks like it is related to the PR's subject (@ianton-ru)

ianton-ru and others added 3 commits May 7, 2026 09:30
@zvonand zvonand merged commit ff71e89 into antalya-26.3 May 7, 2026
290 of 313 checks passed
@alsugiliazova
Copy link
Copy Markdown
Member

Verification report: Altinity/ClickHouse PR #1640

Conclusion

PR is merged. CI red on head, but every failure is either a pre-existing flake or a regression-suite scenario broken at baseline. No PR-caused regression found. The high regression-DB fail count (≈23%) is misleading — it is dominated by suites that already fail ~50–80% on every PR (swarms, parquet/postgresql, export-partition sanity).

Caveat — partial frontport. PR #1640 is a frontport bundle that carries forward only a subset of the related antalya-26.1 features (alternative cluster-function syntax, timezone for partitioning, unquoted timezone fix, remote initiator host name fix, remote initiator improvements). Downstream features that depend on companion PRs not yet frontported to antalya-26.3 (e.g., remote-initiator improvements 2 from #1608/#1638, the CREATE TABLE from table function fix from #1701, the system.databases data-lake-catalog fix from #1690, and any export-partition / parquet plumbing landed only on 26.1) may cause regression failures on antalya-26.3 that look related to this PR but are actually missing-dependency symptoms. The baseline-rate analysis below already confirms that swarms / parquet / export-partition fail at the same rate on unrelated PRs against antalya-26.3, so the failure pattern is consistent with the branch lacking those companion frontports, not with this PR introducing breakage. Re-verify once the dependency frontports land.


CI on head 48ff375d — failures

PR test workflow (3 failed checks, 42 success)

Check Test FAIL Class
Stateless tests (amd_debug, sequential) 00157_cache_dictionary Pre-existing flake — 106 fails / 25 PRs / since 2026-03-02
Stateless tests (arm_asan, azure, sequential, 2/2) 03443_shared_storage_snapshots Pre-existing flake — 30 fails / 18 PRs / since 2026-02-14
Stress test (arm_asan) Test script failed, Possible deadlock on shutdown (see gdb.log) Pre-existing stress instability — 10 / 7 PRs and 6 / 5 PRs respectively, since 2026-04-13

Regression workflow (8 failed checks)

Check Top failing test on PR-1640 builds (30d) Baseline fail rate (30d, all PRs) Class
RegressionTestsRelease / Swarms + Aarch64 / Swarms /swarms/feature/node failure/initiator out of disk space (×24) 78% (133 OK / 483 Fail) Pre-existing broken
same /swarms/feature/swarm joins/join clause (×24) 49% (305 OK / 293 Fail) Pre-existing flaky
RegressionTestsRelease / S3Export (partition) + Aarch64 /s3/minio/export tests/export partition/sanity/basic table (×24) 50% (282 OK / 286 Fail) Pre-existing broken
same …/sanity/replicated partition exports local mode peer replica (×24) 60% (210 OK / 311 Fail) Pre-existing broken
same …/sanity/partition export tight pool lock inside task (×24) 59% (212 OK / 309 Fail) Pre-existing broken
RegressionTestsRelease / S3Export (part) + Aarch64 /s3 suite-level (×14 fails total) flaky Pre-existing flaky
RegressionTestsRelease / Parquet + Aarch64 /parquet/postgresql/compression type/=GZIP etc. (×24 each) 36% (649 OK / 360 Fail per case) Pre-existing flaky

Regression DB on /PRs/1640/ builds (30d): 22,673 Fail / 73,151 OK ≈ 23.6% — looks alarming but is entirely accounted for by the per-test baseline rates above (swarms join-clause alone enumerates >800k parameter combinations and inherits ~50% baseline fail rate, so it dominates the absolute count). No regression test fails on PR-1640 above its all-PR baseline.

@alsugiliazova
Copy link
Copy Markdown
Member

alsugiliazova commented May 11, 2026

Audit Report — PR #1640

Scope

This PR is a frontport bundle carrying forward the following feature groups from antalya-26.1 to antalya-26.3:

Group Key PRs
Alternative syntax for cluster functions (object storage) #1390, #592, #615, #648, #675, #677, #712, #710, #719, #746, #750, #761, #1074, #1120, #1137, #1234
Remote initiator feature #756, #1551, #1577, #1608
Iceberg improvements (timezone, profile metrics, prune) #898, #1103, #1123, #770, #822, #860
DataLakeCatalog namespace filter #1455, #1337
Timezone for partitioning #1453
Unquoted timezone fix in sorting key #1526

~120 files changed, ~11 000 diff lines. Reviewed partitions: cluster-dispatch core, fallback table function mechanism, remote-initiator host-selection, S3 URI dedup fix, REST catalog namespace filter, SettingsChangesHistory, StorageFileCluster/StorageURLCluster signature changes.


Confirmed Defects

HIGH — cluster_name_in_settings: persistent shared mutable state corrupts query rewriting

Status: Open.

Anchor StorageObjectStorageCluster.cpp / updateQueryForDistributedEngineIfNeededcluster_name_in_settings = true (line 418); read at lines 453, 467 in updateQueryToSendIfNeeded
Impact cluster_name_in_settings is an instance field initialized to false and set to true during the first cluster query, but never reset. All subsequent non-cluster reads on the same persistent storage object (CREATE TABLE … ENGINE=…) observe stale true, causing wrong branch selection in updateQueryToSendIfNeeded: wrong flag passed to extractDynamicStorageType, wrong argument construction for the rewritten query AST. Concurrent reads also data-race on the unsynchronized bool (C++ UB).
Trigger Any CREATE TABLE … ENGINE=IcebergS3(…) (or any other StorageObjectStorageCluster engine) that receives at least one cluster query (object_storage_cluster set), followed by any remote-initiator query (object_storage_remote_initiator=1) on the same table object — or two concurrent queries of different types.
Why defect Per-call state (whether the cluster name was injected into the query SETTINGS clause during this specific rewrite) was stored as a per-object field, conflating call-scoped state with object-scoped state.
Fix direction Make updateQueryForDistributedEngineIfNeeded return bool instead of writing to the instance field. In updateQueryToSendIfNeeded, use a local effective_cluster_name_in_settings = cluster_name_in_settings || cluster_name_was_set. The instance field should only be written at construction time (via setClusterNameInSettings), which is the legitimate use for table functions.
Regression test direction Two concurrent SELECT queries on the same StorageObjectStorageCluster table, one with and one without object_storage_cluster; verify neither produces wrong results or a LOGICAL_ERROR.

MEDIUM — Null dereference in remote-initiator path with object_storage_cluster_join_mode='local' + allow_experimental_analyzer=0

Status: Open.

Anchor IStorageCluster.cpp / IStorageCluster::read, remote-initiator branch (line 330): updateQueryWithJoinToSendIfNeeded(query_to_send, query_info.query_tree, context)
Impact Server exception — query_tree->as<QueryNode &>() on a null pointer.
Trigger object_storage_remote_initiator=1 + object_storage_cluster_join_mode='local' + allow_experimental_analyzer=0 + query with a JOIN. The cluster path guards this combination via IStorageCluster::getQueryProcessingStage (throws NOT_IMPLEMENTED before read is reached). The remote-initiator path (cluster_name_from_settings.empty()) returns FetchColumns early from StorageObjectStorageCluster::getQueryProcessingStage and never reaches that guard.
Why defect The remote-initiator branch calls updateQueryWithJoinToSendIfNeeded unconditionally, while the existing guard only fires in the cluster branch.
Fix direction Either add the !allow_experimental_analyzer check to the remote-initiator call site (throw NOT_IMPLEMENTED for the same combination), or null-check query_tree before entering getQueryTreeInfo.
Regression test direction Query with JOIN on a cluster-fallback table function with the three settings above; expect a clear NOT_IMPLEMENTED exception, not a null-deref.

LOW — engine_to_function and function_to_cluster_function maps are out of sync

Status: Open.

Anchor StorageObjectStorageCluster.cpp / updateQueryForDistributedEngineIfNeeded (engine_to_function static map) and updateQueryToSendIfNeeded (function_to_cluster_function static map)
Impact LOGICAL_ERROR at query time for engine types missing from one of the two maps (e.g. DeltaLakeLocal is in engine_to_function with no corresponding cluster entry in function_to_cluster_function; COSN/GCS/OSS are in engine_to_function but absent from function_to_cluster_function).
Trigger CREATE TABLE … ENGINE=DeltaLakeLocal(…) with object_storage_cluster set, or any other engine whose name appears in one map but not the other.
Why defect The two complementary static maps are maintained independently; adding a new engine type to one without updating the other silently creates a LOGICAL_ERROR path at query time.
Fix direction Replace the two maps with a single bidirectional lookup table (or add a startup/compile-time assertion that both directions are complete for every registered engine).
Regression test direction SELECT with object_storage_cluster against a table of each engine type in both maps; verify no LOGICAL_ERROR.

Previously Flagged Defects (Codex P1/P2) — Both Confirmed Fixed

Codex finding Fix status
P1 — StorageObjectStorageCluster::renameInMemory returned after renaming only pure_storage, never calling IStorageCluster::renameInMemory; wrapper storage_id was never updated. Fixed — current code calls IStorageCluster::renameInMemory(new_table_id) after pure_storage->renameInMemory(new_table_id).
P2 — static pcg64 rng(randomSeed()) in host-selection path was shared across threads (data race / UB). Fixed — changed to a local pcg64 rng(randomSeed()) in IStorageCluster::convertToRemote; no shared state.

Coverage Summary

Scope reviewed IStorageCluster.cpp/h, StorageObjectStorageCluster.cpp/h, TableFunctionObjectStorageClusterFallback.cpp/h, TableFunctionObjectStorageCluster.cpp, StorageFileCluster.cpp, StorageURLCluster.cpp, IStorageCluster::convertToRemote (remote-initiator host selection), S3/URI.cpp (region dedup), RestCatalog.cpp (namespace filter), SettingsChangesHistory.cpp
C++ bug classes covered Memory lifetime ✓, RAII/resource leaks ✓, data races ✓ (found one, open), integer overflow ✓, credentials in logs ✓ (host name only, no secrets), iterator invalidation ✓
Categories failed Shared-state / data-race (cluster_name_in_settings), null-safety under non-default setting combination (query_tree + join mode), map completeness (engine_to_function vs function_to_cluster_function)
Categories passed Memory lifetime (no use-after-free/move found), RAII (pure_storage always valid before use), static-local-map init (thread-safe C++11), URI region idempotency logic, renameInMemory fix, static-RNG fix
Assumptions / limits DataLake catalog namespace filter and Timezone-for-partitioning partitions reviewed at diff level only; ICatalog/HivePartitioningUtils deep class hierarchy not fully traced. Integration and stateless test changes reviewed for coverage gaps only.

@ianton-ru
Copy link
Copy Markdown
Author

ianton-ru commented May 12, 2026

LOW — engine_to_function and function_to_cluster_function maps are out of sync

It's correct.
deltaLakeLocal, cosn, gcs, oss do not have cluster versions.
paimon does not have table engine, only table function.
It may change in future.

@alsugiliazova alsugiliazova removed the verified Approved for release label May 12, 2026
@svb-alt svb-alt added the forwardport This is a frontport of code that existed in previous Antalya versions label May 12, 2026
@alsugiliazova alsugiliazova added the verified Approved for release label May 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

antalya antalya-26.3 forwardport This is a frontport of code that existed in previous Antalya versions port-antalya PRs to be ported to all new Antalya releases verified Approved for release verified-with-issues Verified by QA and issues found.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants