From daa59fa09d859b515ee513933a5f41a03155285c Mon Sep 17 00:00:00 2001
From: javier <javier@formatinternet.com>
Date: Tue, 24 Feb 2026 18:20:38 +0100
Subject: [PATCH 1/4] Add array_agg() aggregate function documentation

---
 documentation/query/functions/aggregation.md | 131 +++++++++++++++++++
 documentation/query/functions/array.md       |   6 +-
 2 files changed, 135 insertions(+), 2 deletions(-)

diff --git a/documentation/query/functions/aggregation.md b/documentation/query/functions/aggregation.md
index 383a646a3..745170c6b 100644
--- a/documentation/query/functions/aggregation.md
+++ b/documentation/query/functions/aggregation.md
@@ -74,6 +74,12 @@ calculations. Functions are organized by category below.
 | [bit_or](#bit_or) | Bitwise OR of all non-NULL values |
 | [bit_xor](#bit_xor) | Bitwise XOR of all non-NULL values |
 
+### Array aggregates
+
+| Function | Description |
+| :------- | :---------- |
+| [array_agg](#array_agg) | Collect values or concatenate arrays into a DOUBLE[] |
+
 ### Specialized aggregates
 
 | Function | Description |
@@ -408,6 +414,131 @@ FROM weather_data;
 - [min](#min) - Returns the minimum value itself
 - [first](#first) - Returns the first value by timestamp order
 
+## array_agg
+
+`array_agg(value)` or `array_agg(array)` collects row values into a single
+`DOUBLE[]` array per group. Use it to build per-group vectors for downstream
+array operations such as
+[array_avg](/docs/query/functions/array/#array_avg),
+[dot_product](/docs/query/functions/array/#dot_product), or time-bucketed
+snapshots via SAMPLE BY.
+
+When called with a scalar argument, each row's value becomes one element of the
+output array. When called with a `DOUBLE[]` argument, non-null input arrays are
+concatenated into a single flat `DOUBLE[]`.
+
+#### Parameters
+
+| Parameter | Type | Description |
+| :-------- | :--- | :---------- |
+| `value` | `DOUBLE` (or castable) | Scalar to collect. Each row's value becomes one element. NULL values are included as NaN. |
+| `array` | `DOUBLE[]` | Array to concatenate. Non-null arrays are appended into a flat `DOUBLE[]`. NULL and empty arrays are skipped. |
+| `ordered` | `BOOLEAN` (optional, default `true`) | `true` preserves input order. `false` enables parallel execution, but output order is non-deterministic. |
+
+Only one of `value` or `array` is used per call. The `ordered` parameter can be
+added as a second argument to either form.
+
+#### Return type
+
+Always `DOUBLE[]`. Returns NULL when no rows match or all inputs are NULL.
+
+#### SAMPLE BY support
+
+`array_agg` works with SAMPLE BY and the following FILL options:
+
+- `FILL(NULL)` - empty buckets produce NULL
+- `FILL(PREV)` - empty buckets repeat the previous bucket's array
+- `FILL(NONE)` - empty buckets are omitted from the result
+
+Not supported:
+
+- `FILL(LINEAR)` - interpolation is meaningless for arrays
+- `FILL(constant)` - a scalar cannot fill an array column; use `FILL(NULL)` instead
+
+#### Examples
+
+**Collect prices per symbol:**
+
+```questdb-sql demo title="array_agg - collect prices per symbol"
+SELECT symbol, array_agg(price) AS prices
+FROM fx_trades
+WHERE symbol = 'EURUSD'
+  AND timestamp IN '$now - 3s..$now'
+GROUP BY symbol;
+```
+
+Each row's `price` value is collected into a single `DOUBLE[]` per symbol.
+
+**Time-bucketed arrays with SAMPLE BY:**
+
+```questdb-sql demo title="array_agg - time-bucketed price arrays"
+SELECT timestamp, array_agg(price) AS prices
+FROM trades
+WHERE symbol = 'BTC-USDT'
+  AND timestamp IN '$now - 5s..$now'
+SAMPLE BY 1s;
+```
+
+Produces one array per 1-second bucket containing all trade prices in that
+interval.
+
+**Parallel execution for large datasets:**
+
+```questdb-sql title="array_agg - parallel (unordered) execution"
+SELECT timestamp, array_agg(price, false) AS prices
+FROM trades
+WHERE symbol = 'BTC-USDT'
+  AND timestamp IN '$now - 5s..$now'
+SAMPLE BY 1s;
+```
+
+Passing `false` as the second argument drops the ordering guarantee, allowing
+QuestDB to execute the aggregation in parallel across threads. The array
+elements are the same but their order within each bucket is non-deterministic.
+Use this when element order does not matter and throughput is the priority.
+
+**Concatenate arrays within a time bucket:**
+
+```questdb-sql demo title="array_agg - concatenate bid price arrays"
+SELECT timestamp, array_agg(bids[1]) AS all_bids
+FROM market_data
+WHERE symbol = 'EURUSD'
+  AND timestamp IN '$now - 1s..$now'
+SAMPLE BY 100ms;
+```
+
+Here `bids[1]` is a `DOUBLE[]` column (bid prices at each depth level). The
+function concatenates all non-null bid-price arrays in each 100 ms bucket into a
+single flat array.
+
+**Compose with array functions:**
+
+```questdb-sql demo title="array_agg - cumulative price sums per symbol"
+SELECT symbol,
+  array_cum_sum(array_agg(price)) AS cumulative_prices
+FROM fx_trades
+WHERE symbol = 'EURUSD'
+  AND timestamp IN '$now - 3s..$now'
+GROUP BY symbol;
+```
+
+[array_cum_sum](/docs/query/functions/array/#array_cum_sum) computes running
+totals over the collected prices in timestamp order. This has no scalar aggregate
+equivalent and is useful for intra-group cumulative analysis.
+
+#### Constraints and edge cases
+
+- NULL scalar values become NaN elements in the output array.
+- NULL and empty input arrays are skipped during concatenation, but NULL
+  elements within a non-null array are preserved.
+- Returns NULL (not an empty array) when no rows match the filter or all inputs
+  are NULL.
+
+#### See also
+
+- [Array functions](/docs/query/functions/array/) - Functions that operate on arrays
+- [SAMPLE BY](/docs/query/sql/sample-by/) - Time-series aggregation
+
 ## avg
 
 `avg(value)` calculates simple average of values ignoring missing data (e.g
diff --git a/documentation/query/functions/array.md b/documentation/query/functions/array.md
index 4de2b6580..c0304539b 100644
--- a/documentation/query/functions/array.md
+++ b/documentation/query/functions/array.md
@@ -6,8 +6,10 @@ description: Array functions reference documentation.
 
 This page documents functions for n-dimensional arrays. This isn't an exhaustive
 list of all functions that may take an array parameter. For example, financial
-functions are listed in [their own section](/docs/query/functions/finance/), whether or
-not they can take an array parameter.
+functions are listed in [their own section](/docs/query/functions/finance/),
+whether or not they can take an array parameter. To collect row values into
+arrays during GROUP BY or SAMPLE BY, see
+[array aggregates](/docs/query/functions/aggregation/#array-aggregates).
 
 ## array_avg
 

From 9ebdd26187a5311710d12e7211238039ee7afd29 Mon Sep 17 00:00:00 2001
From: javier <javier@formatinternet.com>
Date: Tue, 24 Feb 2026 18:22:09 +0100
Subject: [PATCH 2/4] Show prices alongside cumulative_prices in compose
 example

---
 documentation/query/functions/aggregation.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/documentation/query/functions/aggregation.md b/documentation/query/functions/aggregation.md
index 745170c6b..5ed859de6 100644
--- a/documentation/query/functions/aggregation.md
+++ b/documentation/query/functions/aggregation.md
@@ -515,6 +515,7 @@ single flat array.
 
 ```questdb-sql demo title="array_agg - cumulative price sums per symbol"
 SELECT symbol,
+  array_agg(price) AS prices,
   array_cum_sum(array_agg(price)) AS cumulative_prices
 FROM fx_trades
 WHERE symbol = 'EURUSD'

From f1606efba410acee6370be66d15433fbf683310f Mon Sep 17 00:00:00 2001
From: javier <javier@formatinternet.com>
Date: Tue, 24 Feb 2026 18:22:48 +0100
Subject: [PATCH 3/4] Remove incorrect claim about no scalar equivalent

---
 documentation/query/functions/aggregation.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/documentation/query/functions/aggregation.md b/documentation/query/functions/aggregation.md
index 5ed859de6..162b1477f 100644
--- a/documentation/query/functions/aggregation.md
+++ b/documentation/query/functions/aggregation.md
@@ -524,8 +524,7 @@ GROUP BY symbol;
 ```
 
 [array_cum_sum](/docs/query/functions/array/#array_cum_sum) computes running
-totals over the collected prices in timestamp order. This has no scalar aggregate
-equivalent and is useful for intra-group cumulative analysis.
+totals over the collected prices in timestamp order.
 
 #### Constraints and edge cases
 

From 11e987a890c9558b02adb7d4f9b3cd67d5a740a0 Mon Sep 17 00:00:00 2001
From: javier <javier@formatinternet.com>
Date: Tue, 24 Feb 2026 18:55:17 +0100
Subject: [PATCH 4/4] Improve array_agg docs for accuracy and consistency

Ref: https://github.com/questdb/questdb/pull/6812

- Add explicit syntax signatures for both scalar and array forms
- Fix return value semantics: scalar form always produces an array when
  rows exist, array form returns NULL when all inputs are NULL/empty
- Rename "Return type" to "Return value" to match page conventions
- Split NULL handling into its own section with per-form behavior
- Add result table with real output to first example
- Remove internal NaN implementation details from user-facing docs
- Add backticks to SAMPLE BY references
---
 documentation/query/functions/aggregation.md | 39 +++++++++++++-------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/documentation/query/functions/aggregation.md b/documentation/query/functions/aggregation.md
index 162b1477f..a1a3700ea 100644
--- a/documentation/query/functions/aggregation.md
+++ b/documentation/query/functions/aggregation.md
@@ -416,12 +416,18 @@ FROM weather_data;
 
 ## array_agg
 
-`array_agg(value)` or `array_agg(array)` collects row values into a single
-`DOUBLE[]` array per group. Use it to build per-group vectors for downstream
-array operations such as
+**Syntax:**
+
+```questdb-sql
+array_agg(value [, ordered]) -> DOUBLE[]
+array_agg(array [, ordered]) -> DOUBLE[]
+```
+
+Collects row values into a single `DOUBLE[]` array per group. Use it to build
+per-group vectors for downstream array operations such as
 [array_avg](/docs/query/functions/array/#array_avg),
 [dot_product](/docs/query/functions/array/#dot_product), or time-bucketed
-snapshots via SAMPLE BY.
+snapshots via `SAMPLE BY`.
 
 When called with a scalar argument, each row's value becomes one element of the
 output array. When called with a `DOUBLE[]` argument, non-null input arrays are
@@ -431,20 +437,25 @@ concatenated into a single flat `DOUBLE[]`.
 
 | Parameter | Type | Description |
 | :-------- | :--- | :---------- |
-| `value` | `DOUBLE` (or castable) | Scalar to collect. Each row's value becomes one element. NULL values are included as NaN. |
+| `value` | `DOUBLE` (or castable) | Scalar to collect. Each row's value becomes one element. NULL values are preserved. |
 | `array` | `DOUBLE[]` | Array to concatenate. Non-null arrays are appended into a flat `DOUBLE[]`. NULL and empty arrays are skipped. |
 | `ordered` | `BOOLEAN` (optional, default `true`) | `true` preserves input order. `false` enables parallel execution, but output order is non-deterministic. |
 
 Only one of `value` or `array` is used per call. The `ordered` parameter can be
 added as a second argument to either form.
 
-#### Return type
+#### Return value
 
-Always `DOUBLE[]`. Returns NULL when no rows match or all inputs are NULL.
+- **Scalar form** - `DOUBLE[]`. Always produces an array when at least one row
+  exists, even if every value is NULL. Empty `SAMPLE BY` buckets produce NULL via
+  `FILL(NULL)`.
+- **Array form** - `DOUBLE[]`. Returns NULL (not an empty array) when every
+  input array is NULL or empty. Empty `SAMPLE BY` buckets produce NULL via
+  `FILL(NULL)`.
 
 #### SAMPLE BY support
 
-`array_agg` works with SAMPLE BY and the following FILL options:
+`array_agg` works with `SAMPLE BY` and the following FILL options:
 
 - `FILL(NULL)` - empty buckets produce NULL
 - `FILL(PREV)` - empty buckets repeat the previous bucket's array
@@ -467,6 +478,10 @@ WHERE symbol = 'EURUSD'
 GROUP BY symbol;
 ```
 
+| symbol | prices |
+| :----- | :----- |
+| EURUSD | [1.1922, 1.1928, 1.1925, 1.1927, 1.1927, 1.1926, 1.1924, 1.1926, 1.1932, 1.1934, 1.1933, 1.1928, 1.1929, 1.1934, 1.1935, 1.1934, 1.1935, 1.1929, 1.193, ...] |
+
 Each row's `price` value is collected into a single `DOUBLE[]` per symbol.
 
 **Time-bucketed arrays with SAMPLE BY:**
@@ -526,13 +541,11 @@ GROUP BY symbol;
 [array_cum_sum](/docs/query/functions/array/#array_cum_sum) computes running
 totals over the collected prices in timestamp order.
 
-#### Constraints and edge cases
+#### NULL handling
 
-- NULL scalar values become NaN elements in the output array.
-- NULL and empty input arrays are skipped during concatenation, but NULL
+- NULL scalar inputs appear as null elements in the output array.
+- NULL and empty input arrays are skipped during concatenation, but null
   elements within a non-null array are preserved.
-- Returns NULL (not an empty array) when no rows match the filter or all inputs
-  are NULL.
 
 #### See also