[GLUTEN-11605][VL] Write per-block column statistics in shuffle writer by acvictor · Pull Request #11769 · apache/gluten

acvictor · 2026-03-16T10:53:59Z

What changes are proposed in this pull request?

This PR adds per-block column statistics (min/max/hasNull) to the shuffle writer pipeline as a prerequisite for block-level pruning using dynamic filters at the shuffle reader. When spark.gluten.sql.columnar.backend.velox.valueStream.dynamicFilter.enabled is true, the shuffle writer computes per-column min/max statistics from raw Arrow buffers during evictBuffers() and serializes them as a kStatisticsPayload block before each non-dictionary payload in the output file. This mirrors how Parquet row-group statistics enable predicate pushdown.
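To make the idea concrete, here is a minimal sketch of computing min/max/hasNull for one int64 column from a raw Arrow values buffer and validity bitmap, plus the kind of check a reader could use to skip a block. It is not the code in this PR: ColumnStats, computeInt64Stats and canSkipBlock are hypothetical names, the hasValue field is an extra assumption for all-null blocks, and the serialization into kStatisticsPayload is not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical per-column statistics record; field names are illustrative
// and are not taken from the PR.
struct ColumnStats {
  int64_t min = std::numeric_limits<int64_t>::max();
  int64_t max = std::numeric_limits<int64_t>::min();
  bool hasNull = false;
  bool hasValue = false; // assumption: stays false when every row is null
};

// Arrow validity bitmaps are LSB-first: bit i set means row i is non-null.
// A null bitmap pointer is treated as "no nulls in this column".
inline bool isValid(const uint8_t* validity, size_t i) {
  return validity == nullptr || ((validity[i >> 3] >> (i & 7)) & 1) != 0;
}

// Single pass over the raw values buffer, skipping null slots.
ColumnStats computeInt64Stats(const int64_t* values, const uint8_t* validity, size_t numRows) {
  ColumnStats stats;
  for (size_t i = 0; i < numRows; ++i) {
    if (!isValid(validity, i)) {
      stats.hasNull = true;
      continue;
    }
    stats.min = std::min(stats.min, values[i]);
    stats.max = std::max(stats.max, values[i]);
    stats.hasValue = true;
  }
  return stats;
}

// Reader-side check for a range filter lo <= x <= hi (a stand-in for a
// dynamic filter). Returns true when no row in the block can match.
bool canSkipBlock(const ColumnStats& stats, int64_t lo, int64_t hi) {
  if (!stats.hasValue) {
    return true; // all rows are null, and a range predicate never matches null
  }
  return stats.max < lo || stats.min > hi;
}
```

For example, a block whose non-null values are 5, -3 and 7 yields min = -3 and max = 7, so a filter on the range [10, 20] could skip it while a filter on [0, 6] could not.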

How was this patch tested?

Added new tests and also ran the CI with config set to true.

Was this patch authored or co-authored using generative AI tooling?

No

Related issue: #11605

github-actions bot added the VELOX label Mar 16, 2026

acvictor force-pushed the acvictor/writerChanges branch 6 times, most recently from cb073fd to 19b8d5a Compare March 16, 2026 11:46

Initial changes

50e0444

acvictor force-pushed the acvictor/writerChanges branch from 19b8d5a to 50e0444 Compare March 16, 2026 13:51

Test with true

336f6de

acvictor marked this pull request as ready for review March 17, 2026 13:57

acvictor wants to merge 2 commits into apache:main from acvictor:acvictor/writerChanges
