[BugFix] Restore dedup pushdown when combined with WHERE clause (#5482)#5488
[BugFix] Restore dedup pushdown when combined with WHERE clause (#5482)#5488ryan-gh-bot wants to merge 3 commits into
Conversation
…search-project#5482) Run PPLSimplifyDedupRule before FilterMergeRule in the HEP optimizer so the bucket-non-null filter PPL emits for dedup is matched as-is. With the previous order, an upstream user where filter sat adjacent to the bucket-non-null filter; FilterMergeRule fired first and merged them into a conjunction that no longer satisfied PPLSimplifyDedupRule's operand predicate, defeating dedup pushdown to the shard. Use sequential addRuleInstance phases for explicit ordering rather than addRuleCollection, which is documented as non-deterministic in firing order. Adds two regression tests in CalcitePPLDedupTest: one that asserts LogicalDedup is produced under the fixed order, and one that pins the buggy behavior under the swapped order. Signed-off-by: ryan-gh-bot <ryan-gh-bot@users.noreply.github.com>
PR Reviewer Guide 🔍(Review updated until commit 0bf8f7c)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to 0bf8f7c Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit 19f506d
Suggestions up to commit 8bfb649
|
RyanL1997
left a comment
There was a problem hiding this comment.
Hi @ryan-gh-bot , thanks for the change, left some reviews.
| } | ||
|
|
||
| /** Try to optimize the plan by using HepPlanner */ | ||
| private static final List<RelOptRule> hepRuleList = |
There was a problem hiding this comment.
@ryan-gh-bot can you explain the reason for removing this?
There was a problem hiding this comment.
The hepRuleList was only ever used as the argument to addRuleCollection(...). The fix here depends on PPLSimplifyDedupRule running strictly before FilterMergeRule, but per Calcite's docs addRuleCollection is non-deterministic in rule firing order — passing the rules in a List does not guarantee they fire in list order. So I switched to two sequential addRuleInstance(...) phases, which is the documented primitive for expressing an ordering invariant (each phase runs to a fixed point before the next begins).
Once the rules were inlined into the builder chain, the List<RelOptRule> field became dead code, so I dropped it (and the now-unused java.util.List / org.apache.calcite.plan.RelOptRule imports). The rule set itself is unchanged — same two rules, just expressed in a way that pins the order.
| } | ||
|
|
||
| /** | ||
| * Regression test for https://github.com/opensearch-project/sql/issues/5482 |
There was a problem hiding this comment.
@ryan-gh-bot it is unnecessary for mentioning the specific issue link here. You can just remove this.
…search-project#5488) Per maintainer review feedback, the regression-test JavaDoc for testDedupAfterWhereProducesLogicalDedup mentioned the originating issue URL. The remaining JavaDoc paragraphs already describe the bug shape and the rule-ordering invariant, so the explicit issue link is unnecessary noise. Signed-off-by: ryan-gh-bot <ryan-gh-bot@users.noreply.github.com>
|
Persistent review updated to latest commit 19f506d |
|
@ryan-gh-bot the rule-reordering fix works but treats a symptom rather than the root cause. The actual brittleness is in Please revise to take option 1 from the issue instead:
Please sanity-check that the loosened operand can't accidentally match user-written |
|
Working on |
…oject#5488) Address review feedback on opensearch-project#5488: extend mayBeFilterFromBucketNonNull to accept the merged conjunction shape FilterMergeRule produces, so PPLSimplifyDedupRule fires regardless of whether FilterMergeRule has already merged the user where clause into the bucket-non-null filter. PPLSimplifyDedupRule.apply now splits the bottom filter into IS_NOT_NULL conjuncts on partition keys (absorbed into LogicalDedup semantics) and any remaining conjuncts (preserved as a separate filter below the new LogicalDedup), so a user predicate that was folded in is no longer dropped. With the operand predicate order-independent, the HEP rule order is no longer a load-bearing invariant. Revert the addRuleCollection -> addRuleInstance change in CalciteToolsHelper.HEP_PROGRAM that the previous patch introduced. Replace the regression test that pinned the buggy rule order with one that asserts the user-visible contract: with where preceding dedup, a LogicalDedup is produced and the user predicate is preserved regardless of which order FilterMergeRule and PPLSimplifyDedupRule fire. Signed-off-by: ryan-gh-bot <ryan-gh-bot@users.noreply.github.com>
|
Done — pushed |
|
Persistent review updated to latest commit 0bf8f7c |
Description
PPL
dedupis initially compiled into aROW_NUMBER() OVER (PARTITION BY field)window pattern with anIS_NOT_NULL(field)bucket-non-null filter directly above the scan.PPLSimplifyDedupRuleis responsible for folding that pattern into aLogicalDedupoperator, whichDedupPushdownRulethen pushes to the shard as acompositeaggregation +top_hits. The HEP program registered bothFilterMergeRule.DEFAULTandPPLSimplifyDedupRule.DEDUP_SIMPLIFY_RULEviaaddRuleCollection. When a userwhereprecedesdedup, the user's filter sat adjacent to the bucket-non-null filter;FilterMergeRulefired first and merged them intoAND(IS_NOT_NULL(field), <user predicate>), which no longer satisfiedPPLSimplifyDedupRule'smayBeFilterFromBucketNonNulloperand predicate. The simplify rule never fired, noLogicalDedupwas produced, andDedupPushdownRulehad nothing to match — dedup fell through to the in-memoryROW_NUMBERwindow form on the coordinator, which exceeds timeouts on large indices.The fix replaces the single
addRuleCollectionwith two sequentialaddRuleInstancephases so each rule runs to a fixed point in a defined order:PPLSimplifyDedupRulefirst (folds the dedup pattern intoLogicalDedupwhile the bucket-non-null filter is still pure), thenFilterMergeRule(consolidates any remaining adjacent filters). Per Calcite's documentation,addRuleCollectionis non-deterministic in rule order, soaddRuleInstanceis the correct primitive to express this ordering invariant.Verification
Before (current source, prior to this PR):
source=EMP | where SAL > 1000 | dedup DEPTNOagainst the existingHEP_PROGRAMordering. The added regression testtestDedupAfterWhereWithFilterMergeFirstFailsToSimplifyexercises the exact rule sequence that production uses today (FilterMergeRule first, PPLSimplifyDedupRule second) on the un-merged plan PPL emits. The optimized plan retains theROW_NUMBERwindow form and never produces aLogicalDedup, soDedupPushdownRulecannot match:After (this branch):
The new
testDedupAfterWhereProducesLogicalDeduptest runs the same rule sequence used byCalciteToolsHelper#HEP_PROGRAMafter this PR (PPLSimplifyDedupRulefirst, thenFilterMergeRule) and asserts that the optimized plan containsLogicalDedupand no longer containsROW_NUMBER. The optimized plan now folds the dedup pattern into aLogicalDedupoperator with the userwherepredicate preserved as a separate filter below it, ready forDedupPushdownRuleto match and rewrite into acomposite+top_hitsaggregation:Dedup pushdown is restored when a
whereclause precedesdedup. All 11 existingCalcitePPLDedupTestcases continue to pass.Related Issues
Resolves #5482
Check List
--signoffor-s.This pull request was authored by @ryan-gh-bot in response to the maintainer command
/triageon #5482 (invoked by @RyanL1997). This is a maintainer-triggered automated contribution. Please review carefully before merging.