Skip to content

fix(tesseract): reduce_by/grain matches time dimensions regardless of granularity#11188

Merged
waralexrom merged 1 commit into
masterfrom
tesseract-reduce_by-fixes
Jul 2, 2026
Merged

fix(tesseract): reduce_by/grain matches time dimensions regardless of granularity#11188
waralexrom merged 1 commit into
masterfrom
tesseract-reduce_by-fixes

Conversation

@waralexrom

Copy link
Copy Markdown
Member

Summary

A multi-stage reduce_by / grain reference to a time dimension silently did nothing, producing wrong results with no error. Fixes #10854.

The projected time dimension carries a granularity suffix in its full name (e.g. created_at_month), while the reduce_by reference points at the bare dimension (created_at). Matching was done by full_name string equality, so the two never matched and the time dimension was never removed from the window's PARTITION BY. Every partition collapsed to a single row — a rank measure returned 1 for every row, and an aggregate silently fell back to a plain group-by returning the un-reduced value. String dimensions have no suffix, so they matched and worked; only time dimensions were affected.

Changes

  • Add MemberSymbol::matches_grain_reference — behaves like has_member_in_reference_chain, but a time dimension additionally matches through its underlying base dimension, so a bare grain reference removes it regardless of the granularity it is queried at.
  • Use it in both grain reshape sites (member_partition_by_logical and partition_filter) for exclude / keep_only.

Testing

  • New Rust integration tests (real Postgres) covering the cube and view paths: multi_stage/reduce_by.rs, rank.rs, views.rs. Each asserts a window is used, created_at is excluded from PARTITION BY, and snapshots the executed results (per-group totals / correct ranks). All were red before the fix, green after.
  • Full cubesqlplanner suite: 1073 passed, 0 failed.

@github-actions github-actions Bot added the rust Pull requests that update Rust code label Jul 1, 2026
@claude

claude Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Claude finished @waralexrom's task in 1m 16s —— View job


Code Review

LGTM — no blocking issues found (0 high, 0 medium, 1 low)
  • Read PR diff and analyze fix
  • Verify matches_grain_reference semantics
  • Check both call sites are updated symmetrically
  • Review test coverage

Summary

The root-cause analysis in the PR description matches the code: projected time dimensions carry a granularity suffix (created_at_month), while reduce_by / grain references point at the bare dimension (created_at), so the old full_name-equality chain check silently missed them. The fix — a dedicated matches_grain_reference that additionally walks through TimeDimension::base_symbol() — is minimal, well-scoped, and correctly applied at both grain-reshape sites (member_partition_by_logical and partition_filter, exclude + keep_only).

Code quality is good: new method is documented with the why, the helper name is clearer than reusing has_member_in_reference_chain, and the change is additive (does not affect string-dimension matching, which already worked).

Correctness

rust/cube/cubesqlplanner/cubesqlplanner/src/planner/symbols/member_symbol.rs:246 — the implementation delegates to has_member_in_reference_chain first, and only falls back to base_symbol() for TimeDimension. base_symbol() returns the underlying dimension symbol; walking its own reference chain covers view aliases and cross-cube joins the same way the existing path does. No behavior change for Dimension / Measure / MemberExpression.

Both call sites (member_query_planner.rs:545,553 and multi_stage_query_planner.rs:297,305) swap the method consistently — exclude and keep_only are updated symmetrically, so there is no asymmetric filter risk.

Testing

test_reduce_by_time_dimension, test_rank_reduce_by_time_dimension, and the corresponding view test each assert three orthogonal properties (window is used, created_at is absent from every PARTITION BY, snapshot of the actual per-partition rows). Snapshots for all three are present. The partition_by_clauses helper is test-only and safe on the generated SQL (no comments/strings containing PARTITION BY in the fixture output).

Low — nit only

partition_by_clauses scans between PARTITION BY and the next ORDER BY (rust/cube/cubesqlplanner/cubesqlplanner/src/tests/integration/multi_stage/reduce_by.rs:32). Fine for these fixtures, but a partition-only window (OVER (PARTITION BY x) with no ORDER BY) or a query with ORDER BY inside a subquery preceding the window would confuse the parser. Not worth changing here — just a note if this helper gets reused. Fix this →

· Branch: `tesseract-reduce_by-fixes`

@codecov

codecov Bot commented Jul 1, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.26%. Comparing base (5b96418) to head (dc51a64).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11188      +/-   ##
==========================================
- Coverage   83.70%   79.26%   -4.45%     
==========================================
  Files         257      474     +217     
  Lines       79465    97013   +17548     
  Branches        0     3572    +3572     
==========================================
+ Hits        66517    76894   +10377     
- Misses      12948    19603    +6655     
- Partials        0      516     +516     
Flag Coverage Δ
cube-backend 59.13% <ø> (?)
cubesql 83.70% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

… granularity

A multi-stage reduce_by/grain reference to a bare time dimension silently
no-oped. The projected time dimension carries a granularity suffix in its
full name (created_at_month) while the reduce_by reference points at the
bare dimension (created_at), so full_name matching never removed it from
the window PARTITION BY. Every partition collapsed to one row (ranks all 1;
aggregates fell back to plain group-by returning the un-reduced value).

Add MemberSymbol::matches_grain_reference, which also matches a time
dimension through its base symbol, and use it in both grain reshape sites
(member_partition_by_logical and partition_filter).

Covered by new Rust integration tests (cube and view paths) in
multi_stage/reduce_by.rs, rank.rs and views.rs.
@waralexrom waralexrom force-pushed the tesseract-reduce_by-fixes branch from 8768482 to dc51a64 Compare July 2, 2026 08:32
@claude

claude Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Claude finished @waralexrom's task in 35s —— View job


Code Review

LGTM — no blocking issues (0 high, 0 medium, 1 low nit). No new commits since prior review.
  • Check for new commits since last review
  • Re-verify matches_grain_reference semantics
  • Confirm both call sites updated symmetrically
  • Review test coverage

Summary

Branch still contains a single commit (dc51a64) — the same diff reviewed on 2026-07-01. Re-verified against the current tree; findings unchanged and the PR already has an approving human review from @ovr.

The fix — a dedicated matches_grain_reference that additionally walks through TimeDimension::base_symbol() — is minimal, well-scoped, and correctly applied at both grain-reshape sites (member_partition_by_logical and partition_filter, exclude + keep_only). No behavior change for Dimension / Measure / MemberExpression.

Correctness

rust/cube/cubesqlplanner/cubesqlplanner/src/planner/symbols/member_symbol.rs:246 — delegates to has_member_in_reference_chain first, falls back to base_symbol() for TimeDimension. Walking the base symbol's own reference chain covers view aliases and cross-cube joins the same way the existing path does.

Both call sites (member_query_planner.rs:545,553 and multi_stage_query_planner.rs:297,305) swap the method consistently — exclude and keep_only are updated symmetrically, so no asymmetric filter risk.

Testing

test_reduce_by_time_dimension, test_rank_reduce_by_time_dimension, and the view test each assert three orthogonal properties (window is used, created_at is absent from every PARTITION BY, snapshot of the per-partition rows). All three snapshots are present.

Low — nit only

partition_by_clauses scans between PARTITION BY and the next ORDER BY (rust/cube/cubesqlplanner/cubesqlplanner/src/tests/integration/multi_stage/reduce_by.rs:32). Fine for these fixtures, but a partition-only window (OVER (PARTITION BY x) with no ORDER BY) or an ORDER BY inside a preceding subquery would confuse the parser. Not worth changing here — just a note if this helper gets reused.

· Branch: tesseract-reduce_by-fixes ·

@waralexrom waralexrom merged commit b99d546 into master Jul 2, 2026
138 of 141 checks passed
@waralexrom waralexrom deleted the tesseract-reduce_by-fixes branch July 2, 2026 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tesseract: multi-stage reduce_by silently no-ops against type:time dimensions

2 participants