Fix transpose VARCHAR length drift in unpivot/pivot lowering by songkant-aws · Pull Request #5479 · opensearch-project/sql

songkant-aws · 2026-05-28T07:17:43Z

Summary

PPL transpose lowers via RelBuilder.unpivot() + pivot(). Calcite's unpivot synthesizes a LogicalValues leaf carrying axis literals (the input field names), and types them as CHAR(maxAxisLiteralLen) — the longest literal wins on column-level type inference. That precision propagates through the rest of the lowering and breaks the analytics-engine route end-to-end.

This PR makes the lowering own its type contract:

Build every axis literal at the same CHAR(maxAxisLiteralLen) type. Calcite then space-pads the shorter literals at value-construction time, so the runtime CHAR vector and the declared schema both have the same fixed length. The downstream TRIM strips the padding.
Wrap the trimmed-axis group key in an explicit CAST(... AS VARCHAR) to unbounded VARCHAR. This makes the group key type match _value_transpose_'s unbounded VARCHAR end-to-end, so the aggregate's row-type check sees consistent types.

Why this has to be in sql plugin (not analytics-engine planner)

The typing decisions happen inside Calcite's RelBuilder.unpivot() when it constructs the VALUES leaf — long before any analytics-engine rule sees the plan. By the time the plan reaches the analytics-engine route, the precision drift is already baked into the RelDataType chain. Fixing it downstream would require pattern-matching on transpose-shaped sub-trees inside the planner, which is fragile and mis-attributes the root cause. The lowering author owns the type contract for the operators it emits.

What goes wrong on the analytics-engine route without this fix

Concrete plan from source=accounts | head N | fields firstname, age, balance | transpose:

LogicalAggregate(group=[{1}], row 1_null=[MAX($0) FILTER $2], ...)
  LogicalProject(_value_transpose_=[CAST($5):VARCHAR NOT NULL],   -- $0
                 $f6=[TRIM(BOTH, ' ', $4)],                       -- $1  ← group key
                 $f7=[=($3, 1)], ...)                             -- $2..
    LogicalProject(... column=$4, _value_transpose_=[CASE(...)])
      LogicalJoin(condition=[true])                               -- cross join from pivot
        LogicalProject(..., ROW_NUMBER() OVER ())
          LogicalSort(fetch=N)
            LogicalScan(t1)
        LogicalValues(tuples=[['firstname'},{'age'},{'balance'}]) -- axis literals

Type chain:

column (group key): CHAR(9) from longest literal 'firstname' → after TRIM → VARCHAR(9)
_value_transpose_: CAST(... AS VARCHAR) — unbounded
MAX(_value_transpose_).type: unbounded VARCHAR (inferred from arg 0 at construction)

Two failure modes downstream:

Aggregate.<init> rejects the plan. Non-prefix groupSet={1} aggregates split into PARTIAL+FINAL on the analytics path. PARTIAL hoists group keys to its output prefix, so FINAL's argList=[0] ends up reading the group-key slot (VARCHAR(9)) instead of the agg-state slot. Calcite's typeMatchesInferred check then fires:
```
type mismatch:
  aggCall type:    VARCHAR
  inferred type:   VARCHAR(9)
```
Substrait/Arrow FixedChar length mismatch at execution time. Even if the aggregate validation passes, the CHAR(maxAxisLiteralLen) propagates through isthmus → DataFusion as a FixedChar(maxAxisLiteralLen) schema. Runtime arrays carry the actual values whose lengths are shorter (e.g. "age" is length 3, not 9), so DataFusion rejects:
```
Row field type (FixedChar{length=3}) does not match schema field type (FixedChar{length=9})
```

Both failures trace back to the same root cause — uneven typing in the unpivot/pivot lowering — and disappear once the lowering emits consistent types.

What changes

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java — visitTranspose:

+    int axisLiteralLength = fieldNames.stream().mapToInt(String::length).max().orElse(0);
+    RelDataType axisLiteralType =
+        rx.getTypeFactory().createSqlType(SqlTypeName.CHAR, axisLiteralLength);
     // unpivot axisMap entries:
-                        ImmutableList.of(rx.makeLiteral(f)),
+                        ImmutableList.of(
+                            (RexLiteral) rx.makeLiteral(f, axisLiteralType, false, false)),
     // group key for pivot:
-    RexNode trimmedColumnName =
-        context.rexBuilder.makeCall(SqlStdOperatorTable.TRIM, ...);
+    RexNode trimmedColumnName =
+        context.rexBuilder.makeCast(
+            varchar,
+            context.rexBuilder.makeCall(SqlStdOperatorTable.TRIM, ...),
+            true);

Tests

CalcitePPLTransposeTest — added testTransposeColumnAxisUsesUnboundedVarchar regression assertion that pins the output column field's type to unbounded VARCHAR. This catches any future change that re-introduces axis-literal precision into the group key.
Updated plan-shape assertions across the existing transpose tests to reflect:
- Padded axis literals ('cnt ', 'COMM ', 'JOB ', 'SAL ', etc.)
- CAST(TRIM(...) AS VARCHAR) shape on the group key
- SparkSQL output GROUP BY CAST(TRIM(\column`) AS STRING)`

End-to-end verification: CalciteTransposeCommandIT 5/5 pass with -Dtests.analytics.parquet_indices=true.

Test plan

Unit: ./gradlew :ppl:test --tests "*CalcitePPLTransposeTest"
Spotless: ./gradlew spotlessCheck
V2 engine path (non-parquet): no regression — unchanged plan shape produces identical results
Analytics-engine path (parquet): CalciteTransposeCommandIT 5/5 (vs 0/5 before)

PPL `transpose` lowers via `RelBuilder.unpivot()` + `pivot()`. The unpivot synthesizes a VALUES leaf carrying axis literals (the input field names), e.g. `VALUES('firstname'), ('age'), ('balance')`. Calcite types each RexLiteral as `CHAR(literalLen)` and types the VALUES column as `CHAR(maxAxisLiteralLen)` — the longest literal wins on column-level type inference. This bites the analytics-engine route end-to-end: 1. After unpivot the `column` axis column is `CHAR(9)` (from "firstname"). Through Calcite's TRIM (`TO_VARYING`) it becomes `VARCHAR(9)`. 2. The `_value_transpose_` value column is built from `CAST(input_field AS VARCHAR)` — unbounded VARCHAR. 3. `MAX(_value_transpose_)` aggCall is created with declared return type = unbounded VARCHAR (inferred from arg 0 at call-construction time). 4. The downstream non-prefix groupSet aggregate (`group=[{1}], MAX($0)`) splits into PARTIAL/FINAL on the analytics path. PARTIAL hoists group keys to the output prefix, so FINAL's `argList=[0]` reads the group-key slot — `VARCHAR(9)` — instead of the agg-state slot. Calcite's `Aggregate.<init>` then runs `typeMatchesInferred` and rejects the plan: declared `VARCHAR` ≠ inferred `VARCHAR(9)`. 5. Even when the aggregate validation passes, the substrait/Arrow path sees `FixedChar(maxAxisLiteralLen)` schema vs runtime arrays whose actual values are shorter (e.g. "age" with length 3) and trips `Row field type (FixedChar{length=3}) does not match schema field type (FixedChar{length=9})`. Two fixes, both in the lowering site: * Build every axis literal at the same `CHAR(maxAxisLiteralLen)` type. Calcite then space-pads the shorter literals at value-construction time, so the runtime CHAR vector and the declared schema both have the same fixed length. The downstream TRIM strips the padding. * Wrap the trimmed-axis group key in an explicit `CAST(... AS VARCHAR)` to unbounded VARCHAR. This makes the group key type match `_value_transpose_`'s unbounded VARCHAR end-to-end, so the aggregate's row-type check sees consistent types regardless of which side the analytics-engine split rule places the group key on. These have to live in sql plugin, not in the analytics-engine planner: the typing decisions are made by Calcite's `RelBuilder.unpivot()` implementation when it constructs the VALUES leaf — long before any analytics-engine rule sees the plan. By the time the plan reaches the analytics-engine route, the precision drift is already baked into the RelDataType chain. Fixing it downstream would require pattern-matching on transpose-shaped sub-trees inside the planner, which is fragile and mis-attributes the root cause. The lowering author owns the type contract for the operators it emits. Adds: - `testTransposeColumnAxisUsesUnboundedVarchar` regression assertion pinning the output `column` field's type to unbounded VARCHAR. Catches any future change that re-introduces axis-literal precision into the group key. - Updated plan-shape assertions across the existing transpose tests to reflect the padded axis literals (`'cnt '`, `'COMM '`, etc.) and the `CAST(TRIM(...) AS VARCHAR)` group key. Verified end-to-end: `CalciteTransposeCommandIT` 5/5 pass with `tests.analytics.parquet_indices=true`. Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-05-28T07:19:17Z

PR Reviewer Guide 🔍

(Review updated until commit `8b89df5`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review Possible Issue When `fieldNames` is empty, `axisLiteralLength` becomes 0, and `axisLiteralType` is `CHAR(0)`. Calcite may reject zero-length CHAR types or produce unexpected behavior. This occurs if all fields are metadata fields or if the input has no fields at all. int axisLiteralLength = fieldNames.stream().mapToInt(String::length).max().orElse(0); RelDataType axisLiteralType = rx.getTypeFactory().createSqlType(SqlTypeName.CHAR, axisLiteralLength);

github-actions · 2026-05-28T07:19:40Z

PR Code Suggestions ✨

Latest suggestions up to 8b89df5
Explore these optional code suggestions:

Category	Suggestion	Impact
General	Handle empty fieldNames case The `axisLiteralLength` calculation may result in 0 when `fieldNames` is empty, which could cause issues when creating a CHAR type with length 0. Consider adding validation to ensure a minimum length of 1, or handle the empty case explicitly to prevent potential type creation errors. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [916-918] -int axisLiteralLength = fieldNames.stream().mapToInt(String::length).max().orElse(0); +int axisLiteralLength = fieldNames.stream().mapToInt(String::length).max().orElse(1); RelDataType axisLiteralType = rx.getTypeFactory().createSqlType(SqlTypeName.CHAR, axisLiteralLength); Suggestion importance[1-10]: 3 __ Why: While the suggestion addresses a potential edge case where `fieldNames` is empty, this scenario is unlikely in the context of the transpose operation which requires at least one field. The change from `orElse(0)` to `orElse(1)` is a minor defensive improvement but doesn't address a critical bug.	Low

Previous suggestions

Suggestions up to commit 35bd92b

Category	Suggestion	Impact
General	Use VARCHAR instead of CHAR type Using `CHAR` type with fixed length can cause issues when field names have varying lengths, as shorter names will be padded with spaces. This padding is why the TRIM operation is needed later. Consider using `VARCHAR` with the maximum length instead to avoid unnecessary padding and simplify the logic. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [916-918] int axisLiteralLength = fieldNames.stream().mapToInt(String::length).max().orElse(0); RelDataType axisLiteralType = - rx.getTypeFactory().createSqlType(SqlTypeName.CHAR, axisLiteralLength); + rx.getTypeFactory().createSqlType(SqlTypeName.VARCHAR, axisLiteralLength); Suggestion importance[1-10]: 9 __ Why: The suggestion correctly identifies that using `CHAR` type causes padding with spaces for shorter field names, which necessitates the `TRIM` operation added in the PR. Using `VARCHAR` would eliminate this padding issue and simplify the logic, making it a high-impact improvement.	High
General	Handle empty field names list safely When `fieldNames` is empty, `orElse(0)` returns 0, which could lead to creating a type with zero length. This may cause issues with type validation or downstream operations. Ensure a minimum length of 1 to prevent potential errors. core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [916] -int axisLiteralLength = fieldNames.stream().mapToInt(String::length).max().orElse(0); +int axisLiteralLength = fieldNames.stream().mapToInt(String::length).max().orElse(1); Suggestion importance[1-10]: 7 __ Why: The suggestion addresses a potential edge case where `fieldNames` is empty, which would result in a zero-length type. While this is a valid concern for robustness, the likelihood of this scenario occurring in practice may be low, and the impact is moderate.	Medium

CalciteExplainIT.testTransposeExplain regenerated against the updated transpose lowering: axis literals are now padded to a uniform CHAR(N) (N = max axis literal length, i.e. 14 for 'account_number'), and the group-key TRIM output is wrapped in a CAST(... AS VARCHAR) to unbounded VARCHAR. The plan-shape diff exactly mirrors the documented behavior change in the parent commit: * `$f20=[TRIM(...)]` → `$f20=[CAST(TRIM(...)):VARCHAR NOT NULL]` * axis literals e.g. 'firstname' → 'firstname ' (padded) * LogicalValues row type tuples are correspondingly padded Verified locally: `./gradlew :integ-test:integTestRemote --tests "*CalciteExplainIT.testTransposeExplain"` passes. Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-05-28T08:13:10Z

Persistent review updated to latest commit 8b89df5

songkant-aws requested review from LantaoJin, RyanL1997, Swiddis, acarbonetto, ahkcs, anirudha, dai-chen, joshuali925, mengweieric, noCharger, penghuo, ps48, qianheng-aws, vamsimanohar, ykmr1224 and yuancu as code owners May 28, 2026 07:17

songkant-aws added PPL Piped processing language calcite calcite migration releated maintenance Improves code quality, but not the product labels May 28, 2026

RyanL1997 approved these changes May 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix transpose VARCHAR length drift in unpivot/pivot lowering#5479

Fix transpose VARCHAR length drift in unpivot/pivot lowering#5479
songkant-aws wants to merge 2 commits into
opensearch-project:mainfrom
songkant-aws:transpose-varchar-fix

songkant-aws commented May 28, 2026

Uh oh!

github-actions Bot commented May 28, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 28, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

songkant-aws commented May 28, 2026

Summary

Why this has to be in sql plugin (not analytics-engine planner)

What goes wrong on the analytics-engine route without this fix

What changes

Tests

Test plan

Uh oh!

github-actions Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 8b89df5)

Uh oh!

github-actions Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions Bot commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions Bot commented May 28, 2026 •

edited

Loading

(Review updated until commit `8b89df5`)

github-actions Bot commented May 28, 2026 •

edited

Loading