Blog: Turning LIMIT into an I/O Optimization: Inside DataFusion’s Multi-Layer Pruning Stack #155

Open
xudong963 wants to merge 2 commits into apache:main from xudong963:site/limit-pruning

Conversation

@xudong963 (Member):

No description provided.

@xudong963 xudong963 changed the title from "Blog: limit pruning" to "Blog: Turning LIMIT into an I/O Optimization: Inside DataFusion's Limit Pruning" on Mar 10, 2026
@xudong963 (Member Author):

the staging deploy failed, looks like I don't have the privilege:

fatal: unable to access 'https://github.com/apache/datafusion-site/': The requested URL returned error: 403
Error: Process completed with exit code 128.

@xudong963 xudong963 requested a review from alamb March 10, 2026 07:31
@xudong963 xudong963 force-pushed the site/limit-pruning branch from 577fcb2 to 080696d on March 10, 2026 08:33
@xudong963 xudong963 changed the title from "Blog: Turning LIMIT into an I/O Optimization: Inside DataFusion's Limit Pruning" to "Blog: Turning LIMIT into an I/O Optimization: Inside DataFusion’s Multi-Layer Pruning Stack" on Mar 10, 2026
@alamb (Contributor) commented Mar 10, 2026:

Thanks @xudong963 -- I will review this one, though maybe not for a few days

@xudong963 (Member Author):

> Thanks @xudong963 -- I will review this one, though maybe not for a few days

Thanks! No hurry.

@alamb (Contributor) left a comment:

This is amazing @xudong963 -- very nice 🙏

I left some style suggestions, but nothing I think is required.

Again, really nice


Before diving into limit pruning, let's understand the full pruning pipeline. DataFusion scans Parquet data through a series of increasingly fine-grained filters, each one eliminating data so the next stage processes less:

(figure: the multi-layer pruning pipeline)

Contributor: (attached image)


### Phase 1: High-Level Discovery

- **Partition Pruning**: The `ListingTable` component evaluates filters that depend only on partition columns — things like `year`, `month`, or `region` encoded in directory paths (e.g., `s3://data/year=2024/month=01/`). Irrelevant directories are eliminated before we even open a file.
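To make the directory-level elimination concrete, here is a minimal, self-contained sketch of pruning Hive-style `key=value` paths against a `year = 2024` filter. The function names are invented for illustration and are not DataFusion's actual API:

```rust
// Illustrative sketch: keep only directories whose partition value
// satisfies the predicate `year = 2024`. Irrelevant directories are
// dropped before any file inside them is opened.

/// Extract the value of a `key=value` segment from a Hive-style path.
fn partition_value<'a>(path: &'a str, key: &str) -> Option<&'a str> {
    path.split('/')
        .find_map(|seg| seg.strip_prefix(key)?.strip_prefix('='))
}

/// Hypothetical pruning pass: retain paths where `year = 2024`.
fn prune_partitions<'a>(paths: &[&'a str]) -> Vec<&'a str> {
    paths
        .iter()
        .copied()
        .filter(|p| partition_value(p, "year") == Some("2024"))
        .collect()
}

fn main() {
    let dirs = [
        "s3://data/year=2023/month=12/",
        "s3://data/year=2024/month=01/",
        "s3://data/year=2024/month=02/",
    ];
    // Only the two year=2024 directories survive the filter.
    println!("{:?}", prune_partitions(&dirs));
}
```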
Contributor:

Minor suggestion is to link the ListingTable to its API docs: https://docs.rs/datafusion/latest/datafusion/datasource/listing/struct.ListingTable.html


### Phase 2: Row Group Statistics

For each surviving file, DataFusion reads row group metadata and classifies each row group into one of three states:
Contributor:

It might help to also mention that the example data (like "snow vole") in the images is from the snowflake paper as otherwise it seems somewhat out of context

- **Not Matching**: Statistics prove that no row in the group can satisfy the predicate, so the group is skipped entirely.
- **Partially Matching**: Statistics cannot rule out matching rows, but also cannot guarantee them. These groups might be scanned and verified row by row later.
- **Fully Matching**: Statistics prove that *every single row* in the group satisfies the predicate. This state is key to making limit pruning possible.

Additionally, **bloom filters** could eliminate row groups for equality and `IN`-list predicates at this stage.
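The three-state classification can be sketched for a single comparison predicate like `value > threshold`, using only a row group's min/max statistics. The enum and function below are illustrative, not DataFusion's real types:

```rust
// Illustrative sketch of row group classification from min/max statistics,
// for the predicate `value > threshold`.

#[derive(Debug, PartialEq)]
enum RowGroupMatch {
    NotMatching,       // no row can satisfy the predicate: skip the group
    PartiallyMatching, // some rows may match: verify row by row later
    FullyMatching,     // every row provably matches: key to limit pruning
}

fn classify(min: i64, max: i64, threshold: i64) -> RowGroupMatch {
    if max <= threshold {
        // Even the largest value in the group fails `value > threshold`.
        RowGroupMatch::NotMatching
    } else if min > threshold {
        // Even the smallest value passes, so every single row matches.
        RowGroupMatch::FullyMatching
    } else {
        // The range straddles the threshold: statistics are inconclusive.
        RowGroupMatch::PartiallyMatching
    }
}

fn main() {
    // Predicate: value > 10
    println!("{:?}", classify(20, 30, 10)); // fully matching
    println!("{:?}", classify(0, 5, 10));   // not matching
    println!("{:?}", classify(5, 15, 10));  // partially matching
}
```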
Contributor:

You could potentially combine this into the introduction sentence instead to make the doc more concise. Something like, instead of

> For each surviving file, DataFusion reads row group metadata and classifies each row group into one of three states:

change to

> For each surviving file, DataFusion reads row group metadata and potentially BloomFilters and classifies each row group into one of three states:


The final phase goes even deeper:

- **Page Index Pruning**: Parquet pages have their own min/max statistics. DataFusion uses these to skip individual data pages within a surviving row group.
Contributor:

It might help to link to the relevant parquet doc: https://parquet.apache.org/docs/file-format/pageindex/


1. Negate the original predicate
2. Simplify the negated expression
3. Evaluate the negation against the row group's statistics
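For a simple comparison predicate, the three steps above can be sketched as follows. This is a hypothetical illustration for one predicate shape; the real implementation works on general expressions and Parquet statistics:

```rust
// Sketch of the negation trick: if the *negated* predicate cannot match
// any row according to min/max statistics, then the original predicate
// holds for every row, i.e. the row group is fully matched.

#[derive(Clone, Copy)]
enum Pred {
    Gt(i64), // value > n
    Le(i64), // value <= n
}

// Steps 1 + 2: negate and simplify. NOT(value > n) simplifies to value <= n.
fn negate(p: Pred) -> Pred {
    match p {
        Pred::Gt(n) => Pred::Le(n),
        Pred::Le(n) => Pred::Gt(n),
    }
}

// Step 3: could the predicate match *any* row whose values lie in [min, max]?
fn may_match(p: Pred, min: i64, max: i64) -> bool {
    match p {
        Pred::Gt(n) => max > n,
        Pred::Le(n) => min <= n,
    }
}

/// Fully matched = the negation provably matches no row in the group.
fn fully_matched(p: Pred, min: i64, max: i64) -> bool {
    !may_match(negate(p), min, max)
}

fn main() {
    // Row group with min=20, max=30 under `value > 10`:
    // the negation `value <= 10` cannot match, so every row matches.
    assert!(fully_matched(Pred::Gt(10), 20, 30));
    // min=5, max=15: the negation may match, so not fully matched.
    assert!(!fully_matched(Pred::Gt(10), 5, 15));
    println!("ok");
}
```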
Contributor:

maybe worth pointing out that DataFusion already had steps 2 and 3 so the change is relatively straightforward


## Case Study: Alpine Wildlife Query

Let's walk through a concrete example. Given a wildlife tracking dataset with four row groups:
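Before walking through the dataset, here is a minimal simulation of how limit pruning chooses which row groups to scan. The classifications and row counts below are invented for illustration; the point is that once fully matched groups alone cover the `LIMIT`, the remaining groups are skipped:

```rust
// Hypothetical simulation of limit pruning over classified row groups.
// Only fully matched groups contribute *guaranteed* rows toward the limit.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Match { Not, Partial, Full }

/// Return the indexes of row groups that must still be scanned for `LIMIT limit`.
fn groups_to_scan(groups: &[(Match, usize)], limit: usize) -> Vec<usize> {
    let mut guaranteed = 0; // rows proven to satisfy the predicate so far
    let mut scan = Vec::new();
    for (i, &(m, rows)) in groups.iter().enumerate() {
        if guaranteed >= limit {
            break; // early exit: the limit is already satisfied
        }
        match m {
            Match::Not => {}                // pruned by statistics
            Match::Partial => scan.push(i), // must be verified row by row
            Match::Full => {
                scan.push(i);
                guaranteed += rows;         // every row counts toward the limit
            }
        }
    }
    scan
}

fn main() {
    // Four row groups of 100 rows each; groups 1 and 2 are fully matched.
    let groups = [
        (Match::Partial, 100),
        (Match::Full, 100),
        (Match::Full, 100),
        (Match::Partial, 100),
    ];
    // LIMIT 150: groups 0, 1, 2 are scanned; group 3 is skipped entirely.
    println!("{:?}", groups_to_scan(&groups, 150));
}
```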
Contributor:

I again think it would be good to credit the snowflake paper for this example


There are two natural extensions of this work:

**Page-Level Limit Pruning**: Today, "fully matched" detection operates at the row group level. If we extend this to use page index statistics, we could stop decoding pages *within* a row group once the limit is met. This would pay dividends for wide row groups where only a few pages hold matching data.
Contributor:

If there are tickets, I would recommend linking here (as a way to fish for contributions).
If there are no tickets, maybe we should file them 🎣

Member Author:

I have a PR for it in our fork



**Row Filter Hints**: Even when a row group is fully matched, the current row filter still evaluates predicates row by row. If we pass the fully matched groups info into the row filter builder, we can skip predicate evaluation entirely for guaranteed groups — saving CPU cycles on predicate evaluation.
Contributor:

Likewise here is a good place to add a link to the ticket

Member Author:

ditto


DataFusion's pruning pipeline trims redundant I/O from the partition level all the way down to individual rows. Limit pruning adds a new step that creates an early exit when fully matched row groups already satisfy the `LIMIT`. The result is fewer row groups scanned, less data decoded, and faster queries.

The key insights are:

@xudong963 (Member Author):

@alamb Thanks for the review, addressed all comments in c54a3fe

@xudong963 (Member Author):

@alamb iirc, the feature is included in the DF53, so we can merge the PR before writing the DF53 blog, and mention the feature by the blog link.

@alamb (Contributor) commented Mar 16, 2026:

> @alamb iirc, the feature is included in the DF53, so we can merge the PR before writing the DF53 blog, and mention the feature by the blog link.

That is a good call - I am hoping to start on a DF 53 blog later this week

I am happy to merge this PR on whatever timescale you want (not sure if you want to solicit any more reviews)

@xudong963 (Member Author):

> (not sure if you want to solicit any more reviews)

Yeah, I'd like to see more reviews, so I plan to keep it open until the DF53 blog is near
