Explicit head command should override fetch_size in PPL#5194
Explicit head command should override fetch_size in PPL#5194ahkcs wants to merge 5 commits intoopensearch-project:mainfrom
Conversation
PR Reviewer Guide 🔍(Review updated until commit a6e1216)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to a6e1216 Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit e62de53
Suggestions up to commit adea087
Suggestions up to commit 7f797a9
Suggestions up to commit ff1fd32
Suggestions up to commit d2b35ee
|
When a user's PPL query contains an explicit head command, the fetch_size parameter should not inject an additional Head node on top. Previously, fetch_size always wrapped the plan with Head(fetchSize), creating a double-Head that could cap results unexpectedly (e.g., head 100 with fetch_size=5 would only return 5 rows). This adds a containsHead() check in AstStatementBuilder to skip the Head injection when the user's query already has a head command, letting the user's explicit intent take precedence. Signed-off-by: Kai Huang <ahkcs@amazon.com>
d2b35ee to
ff1fd32
Compare
|
Persistent review updated to latest commit ff1fd32 |
Remove the outer LogicalSort(fetch=[fetchSize]) from expected explain plans when an explicit head command is present, since fetch_size no longer injects a Head node in that case. Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit 7f797a9 |
| * Recursively checks if the AST contains a {@link Head} node. When the user's query already | ||
| * includes an explicit {@code head} command, we should not inject an additional Head for | ||
| * fetch_size so that the user's explicit limit takes precedence. |
There was a problem hiding this comment.
So if a query ends with ... | head 3 | head 500, the limit in final physical plan is Limit 500?
There was a problem hiding this comment.
It should be 3, added a corresponding test case
There was a problem hiding this comment.
So if a query ends with
... | head 3 | head 500, the limit in final physical plan is Limit 500?
The new logic only decides if fetch_size will inject a new head node or not, so it won't affect the original head command behavior(which is that the smaller head command will take effect)
There was a problem hiding this comment.
Sorry I meant without current change, it will be LIMIT 500? If so, is there bug elsewhere or this is Calcite planner expected behavior?
There was a problem hiding this comment.
Here's an example physical plan:
https://github.com/opensearch-project/sql/pull/5194/changes#diff-0c3c28474cebca090844af02cd01618a9ec578f69a272774934c7b3fb1c020aeR8
If without the current change, it will be LIMIT 3 in the final OpenSearchRequestBuilder(sourceBuilder
I think it's the expected behavior
Verifies that when a query has multiple heads (e.g., head 3 | head 500), fetch_size does not inject yet another Head node. The existing Head nodes are preserved as-is, with the inner head 3 limiting first. Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit adea087 |
|
Persistent review updated to latest commit e62de53 |
Signed-off-by: ahkcs <austinhkcs@gmail.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit a6e1216 |
Summary
Fixes bug 2 in #5191
When a user's PPL query contains an explicit
headcommand, thefetch_sizeAPI parameter should not inject an additionalHeadnode on top. Previously,AstStatementBuilder.visitPplStatement()always wrapped the plan withHead(fetchSize), creating a double-Head that capped results unexpectedly.Example:
source=index | head 100withfetch_size=5would only return 5 rows instead of the expected 100 rows, because the outerHead(5)silently overrode the user's explicithead 100.Test plan
head 5+fetch_size=10→ 5 rows ✓head 8+fetch_size=3→ 8 rows ✓ (head overrides fetch_size)fetch_size=3(no head) → 3 rows ✓ (fetch_size still works when no head)