Bigquery: Support DISTINCT / ORDER BY / LIMIT inside ARRAY_AGG and STRING_AGG #2660
Open
hbridge wants to merge 1 commit into
…nd STRING_AGG

Adds `array_agg_arg` rule to the BigQuery grammar that accepts the full GoogleSQL aggregate-modifier syntax: [DISTINCT] expr [ORDER BY …] [LIMIT n]

Previously `aggr_array_agg` used a bare `expr` match, so all three modifier forms threw a parse error. The new rule mirrors the PostgreSQL `distinct_args` pattern already present in this repo, reuses the existing `count_arg` structure (DISTINCT + or_and_expr + order_by_clause?), and adds an optional `limit_clause?` as the BigQuery-specific piece.

`aggrToSQL` gains LIMIT serialisation (`args.limit → limitToSQL`) so the AST round-trips correctly.

The pre-existing "struct expr" test expectation is updated from the broken `ARRAY_AGG(undefined)` to the now-correctly-parsed `ARRAY_AGG(STRUCT(…)LIMIT 3)` output. Twelve new tests (round-trip + AST shape) cover DISTINCT, LIMIT, ORDER BY, and all three combined for both ARRAY_AGG and STRING_AGG.
Problem
The BigQuery grammar rejected all three GoogleSQL aggregate modifiers (`DISTINCT`, `ORDER BY`, and `LIMIT`) for `ARRAY_AGG` and `STRING_AGG`. These are all valid per the GoogleSQL `ARRAY_AGG` docs.

Root cause
`aggr_array_agg` in `pegjs/bigquery.pegjs` matched the argument as a bare `expr`. Any modifier before the closing `)` left unconsumed tokens and caused a parse error.

Fix
One rule change in `pegjs/bigquery.pegjs`: added an `array_agg_arg` rule that accepts the full modifier syntax, then wired `aggr_array_agg` to use it.
The new rule:
- Reuses `count_arg`'s `KW_DISTINCT? / or_and_expr / order_by_clause?` pattern, which is already tested and correct
- Follows the `distinct_args` precedent already in `pegjs/postgresql.pegjs`
- Adds `limit_clause?` as the BigQuery-specific piece (`LIMIT` inside an aggregate is GoogleSQL-only)
- Returns `{ distinct, expr, orderby, limit }`, the same shape as `count_arg`, so `aggrToSQL` needs only one new line

`src/aggregation.js` gains one line: `if (args.limit) str = [str, limitToSQL(args.limit)].join(' ')`.

Tests
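Added to `test/bigquery.spec.js`:
- Round-trip tests for `DISTINCT`, `ORDER BY`, `LIMIT`, and all three combined (`ARRAY_AGG` and `STRING_AGG`)
- AST-shape tests checking that `args.distinct`, `args.orderby`, and `args.limit` are populated correctly
- The pre-existing "struct expr" expectation updated from the broken `ARRAY_AGG(undefined)` (args were unreachable before) to the now-correctly-parsed output

Full suite: 1584 passing, 1 pending, 0 regressions.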
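
For reviewers who want to see how the `{ distinct, expr, orderby, limit }` shape could serialise back to SQL, here is a minimal, self-contained sketch. It is not the code in `src/aggregation.js`: the `*Sketch` function names are hypothetical, and the argument nodes are simplified to plain strings and numbers, whereas the real AST nodes are richer.

```javascript
// Hypothetical, simplified stand-ins for the repo's helpers.
// Assumption: expressions are plain strings and the limit is a plain number.
function orderByToSQLSketch(orderby) {
  const items = orderby.map(({ expr, type }) => (type ? `${expr} ${type}` : expr))
  return `ORDER BY ${items.join(', ')}`
}

function limitToSQLSketch(limit) {
  return `LIMIT ${limit}`
}

// Serialise an aggregate call from the { distinct, expr, orderby, limit } shape.
function aggrToSQLSketch(name, args) {
  const { distinct, expr, orderby, limit } = args
  let str = distinct ? `DISTINCT ${expr}` : expr
  if (orderby && orderby.length > 0) str = [str, orderByToSQLSketch(orderby)].join(' ')
  // Analogue of the one-line change described above: append LIMIT when present.
  if (limit != null) str = [str, limitToSQLSketch(limit)].join(' ')
  return `${name}(${str})`
}

const sql = aggrToSQLSketch('ARRAY_AGG', {
  distinct: true,
  expr: 'x',
  orderby: [{ expr: 'x', type: 'DESC' }],
  limit: 3,
})
console.log(sql) // ARRAY_AGG(DISTINCT x ORDER BY x DESC LIMIT 3)
```

The modifiers are appended in the same order the grammar accepts them (`DISTINCT`, expression, `ORDER BY`, `LIMIT`), which is what lets a parsed query round-trip to equivalent text.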