Fix #5178: SUM of all-null column returns NULL per SQL standard#5189
Fix #5178: SUM of all-null column returns NULL per SQL standard#5189opensearchpplteam wants to merge 3 commits intoopensearch-project:mainfrom
Conversation
…ead of 0 Replace OpenSearch sum aggregation with stats aggregation for SUM pushdown. The stats aggregation exposes both sum and count, allowing detection of all-null input (count == 0) to return null per the SQL standard. Added SumStatsParser to handle the stats response parsing. Signed-off-by: Claude <noreply@anthropic.com> Signed-off-by: EC2 Default User <ec2-user@ip-172-31-19-129.us-west-2.compute.internal>
PR Reviewer Guide 🔍(Review updated until commit c1e2833)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to c1e2833 Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit e5560ec
Suggestions up to commit cc21a76
|
- Fix stats aggregation ordering: when sorting on a stats (multi-value) aggregation, append .sum to the order path so OpenSearch can resolve the correct sub-metric - Update test expectations: SUM of all-null columns now correctly returns null instead of 0 in all integration tests - Update explain YAML expected outputs to reflect the .sum order path Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: EC2 Default User <ec2-user@ip-172-31-19-129.us-west-2.compute.internal>
|
Persistent review updated to latest commit e5560ec |
Signed-off-by: EC2 Default User <ec2-user@ip-172-31-19-129.us-west-2.compute.internal>
|
Persistent review updated to latest commit c1e2833 |
| composite.getSubAggregations().stream() | ||
| .filter(sub -> sub.getName().equals(aggName)) | ||
| .findFirst() | ||
| .orElse(null); |
There was a problem hiding this comment.
The resolveOrderPath method uses .orElse(null) when looking up the sub-aggregation. If the aggregation name is not found (which should not happen in normal flow), the method silently returns aggName without the .sum suffix rather than failing fast. Consider using .orElseThrow() to surface unexpected state earlier, or add a brief comment explaining why the fallback is intentional.
|
This is opensearch issue. opensearch-project/OpenSearch#17581. |
|
Thanks for flagging this, @penghuo. You're right that the root cause is in OpenSearch's This PR provides a SQL-plugin-level workaround so that |
Description
Root Cause: OpenSearch's
sumaggregation returns0.0when all input values are null, which violates the SQL standard that requiresSUMof an all-null column to returnNULL.Fix: Replace the
sumaggregation withstatsaggregation for SUM pushdown. Thestatsaggregation provides bothsumandcountfields. Whencount == 0, all input values were null, so the parser returnsnullinstead of0.0. This is implemented in a newSumStatsParserclass that follows the existingMetricParserpattern.Both the Calcite code path (
AggregateAnalyzer) and the legacy code path (MetricAggregationBuilder) are updated consistently.Related Issues
#5178
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.