feat: Implement PPL convert command with 5 conversion functions#5157

Open
ritvibhatt wants to merge 37 commits into opensearch-project:main from ritvibhatt:ppl-convert

Conversation


@ritvibhatt ritvibhatt commented Feb 19, 2026

Description

Copy of #5034, resolves merge conflicts

Related Issues

Partially resolves #5001 (only numeric and memory conversions in this PR)

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.

- Added convert command syntax and AST nodes (Convert, ConvertFunction)
- Implemented 5 conversion functions: auto, num, rmcomma, rmunit, none
- Full Calcite pushdown support via CalciteRelNodeVisitor
- Logical and physical operators (LogicalConvert, ConvertOperator)
- Comprehensive test coverage:
  * 13 unit tests (CalcitePPLConvertTest)
  * 8 integration tests with pushdown (ConvertCommandIT)
  * 8 non-pushdown tests (CalciteConvertCommandIT)
  * 3 explain tests (ExplainIT)
  * 2 cross-cluster tests (CrossClusterSearchIT)
  * Anonymizer and v2 unsupported tests
- User documentation (docs/user/ppl/cmd/convert.md)
- Code cleanup: removed dead code, simplified javadocs
- Version: 3.5 (experimental)

All tests compile and pass successfully.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Centralized common conversion function logic into BaseConversionUDF base class,
reducing code from 219 to 162 lines (26% reduction) while maintaining all functionality.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
…d separating concerns into each respective function class

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Replace ConvertFunction class with standard Let expressions to
handle none() at AST level and reduce code complexity.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
…tCommand

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Add testNumConvertNullAndEmpty() to ensure NumConvertFunction.convert()
properly returns null for null, empty string, and whitespace-only inputs,
matching the coverage pattern of other conversion functions.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Add testRmunitConvertNullAndEmpty() to ensure RmunitConvertFunction.convert()
properly returns null for null, empty string, and whitespace-only inputs,
completing the test coverage pattern across all conversion functions.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Update testConvertWithAlias to verify that using AS clause preserves
the original field alongside the converted field. Test now checks both
balance (bigint) and balance_num (double) are present in results.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 17

🧹 Nitpick comments (9)
integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java (1)

217-222: Assertions only verify column presence — consider asserting converted values.

Both tests call only verifyColumn(result, columnName(...)), which confirms the output column exists but does not validate that auto(balance) actually produced correct converted values. Compare with the existing testCrossClusterSortWithTypeCasting (line 162), which uses verifyDataRows to assert specific numeric values after a similar conversion. A column-name-only check would pass even if the conversion silently returned nulls or wrong types for every row.

Consider adding a verifyDataRows call with representative expected balance values (the same 7 rows already expected in testCrossClusterSortWithTypeCasting), or at minimum assert that the result is non-empty and the converted column is of numeric type.

As per coding guidelines, "Check test assertions are meaningful and specific."

Also applies to: 231-236

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java`
around lines 217 - 222, Current assertions in CrossClusterSearchIT call
verifyColumn(result, columnName("balance")) after executeQuery(...) which only
checks column presence; update the tests to assert actual converted numeric
values by invoking verifyDataRows(result, expectedRows...) using the same 7
expected balance values used in testCrossClusterSortWithTypeCasting, or at
minimum add assertions that the result is non-empty and that each value in the
converted "balance" column is numeric; locate the call sites around
executeQuery/String.format(...) with TEST_INDEX_BANK_REMOTE and replace or
extend the verifyColumn(...) checks with verifyDataRows(...) (or additional
numeric-type/non-empty assertions) to make the test meaningful.
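To make the reviewer's point concrete, here is a small Python sketch (not the project's Java test harness; the `schema`/`datarows` response shape, the helper names, and the sample balance values are illustrative assumptions) showing how a column-presence check passes even when the conversion silently produced nulls, while a row-value check catches it:

```python
# Mock of a query response where the converted column exists but every
# value came back null -- the failure mode the reviewer describes.
result = {
    "schema": [{"name": "balance", "type": "double"}],
    "datarows": [[None], [None]],
}

def verify_column(result, name):
    """Column-presence check: only inspects the schema, not the data."""
    assert any(col["name"] == name for col in result["schema"])

def verify_data_rows(result, expected):
    """Row-value check: compares the actual converted values."""
    assert result["datarows"] == expected

verify_column(result, "balance")  # passes despite the all-null data
try:
    verify_data_rows(result, [[39225.0], [5686.0]])
except AssertionError:
    print("value assertion catches the regression")
```

The same contrast motivates adding `verifyDataRows` alongside `verifyColumn` in the cross-cluster tests.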
integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java (1)

450-459: Consider adding a verifyDataRows call for a more specific assertion.

Both new tests assert only column presence and schema type, which is consistent with the existing pattern in this class (e.g., testCrossClusterFieldsSpaceDelimited). However, the coding guidelines ask that integration test assertions be "meaningful and specific." Adding at least one row-value check would strengthen confidence that actual values are being converted correctly across the cluster boundary, not just that the projected type is correct.

💡 Optional: add row-value assertion
  @Test

  public void testCrossClusterConvertWithAlias() throws IOException {
    JSONObject result =
        executeQuery(
            String.format(
-               "search source=%s | convert auto(balance) AS balance_num | fields balance_num",
+               "search source=%s | where firstname='Hattie' | convert auto(balance) AS balance_num | fields balance_num",
                TEST_INDEX_BANK_REMOTE));
    verifyColumn(result, columnName("balance_num"));
    verifySchema(result, schema("balance_num", "double"));
+   verifyDataRows(result, rows(5686.0));
  }

The same pattern could apply to testCrossClusterConvert.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java`
around lines 450 - 459, Add a concrete row-value assertion to
testCrossClusterConvertWithAlias: after the existing verifyColumn(result,
columnName("balance_num")) and verifySchema(result, schema("balance_num",
"double")), call verifyDataRows(result, rows("balance_num",
"<expected_double>")) to assert at least one converted numeric value is correct
(replace <expected_double> with the known balance value from
TEST_INDEX_BANK_REMOTE as a double); reference symbols:
testCrossClusterConvertWithAlias, executeQuery, verifyColumn, verifySchema,
verifyDataRows.
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java (1)

1-272: Missing negative test cases required by PPL guidelines

This file path falls under **/ppl/**/*.java, which mandates verifying grammar tests with positive and negative cases. The entire suite is positive-path only. There are no tests verifying parser/planner behavior for:

  • An unrecognised conversion function name (e.g., convert unknown_fn(SAL))
  • A missing argument (e.g., convert auto())
  • A non-existent source field (e.g., convert auto(NONEXISTENT_COL))
  • none() called with an alias that shadows an existing column

At minimum, an exception/error-propagation test for an invalid function name should be added to confirm the parser rejects it cleanly.

As per coding guidelines, For PPL parser changes, verify grammar tests with positive/negative cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java`
around lines 1 - 272, The test suite in CalcitePPLConvertTest only covers
positive convert cases; add negative grammar tests to validate error handling
(at minimum a test that asserts an invalid conversion function is rejected). In
CalcitePPLConvertTest add a new `@Test` method (e.g., testConvertInvalidFunction)
that calls getRelNode("source=EMP | convert unknown_fn(SAL)") and asserts an
exception is thrown (use assertThrows or the existing test harness), and
similarly add tests for missing argument ("convert auto()"), non-existent field
("convert auto(NONEXISTENT_COL)"), and alias shadowing with none() (e.g.,
"convert none(EMPNO) AS EMPNO") to verify the parser/planner raises/propagates
errors instead of succeeding; use getRelNode and the existing verification
helpers (verifyLogical/verifyPPLToSparkSQL) pattern for positive vs negative
assertions.
core/src/main/java/org/opensearch/sql/expression/function/udf/BaseConversionUDF.java (1)

25-53: Missing JavaDoc on public API methods.

Per coding guidelines, public methods in core/src/main/java/**/*.java must have JavaDoc with @param, @return, and @throws. The public methods getReturnTypeInference(), getOperandMetadata(), and convertValue() (which is the core template method) lack documentation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/BaseConversionUDF.java`
around lines 25 - 53, Add JavaDoc for the public API methods: add Javadoc
comments to BaseConversionUDF#getReturnTypeInference() and
BaseConversionUDF#getOperandMetadata() including a brief description and
appropriate `@return` tags; also locate the core template method convertValue(...)
(in this class or ConversionImplementor) and add Javadoc describing its purpose,
parameters (`@param` for each argument), return value (`@return`) and any exceptions
it can throw (`@throws`). Ensure the Javadoc text follows project conventions and
documents nullability/behavior where relevant.
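As an illustration of the template-method shape being discussed (a Python sketch with hypothetical names, not the actual `BaseConversionUDF` Java API), the shared null/empty handling lives once in the base class while each conversion function overrides a single hook:

```python
import math

class BaseConversion:
    """Template method: common input handling here; subclasses supply
    only their specific string-to-value conversion."""

    def convert(self, value):
        if value is None:
            return None
        text = str(value).strip()
        if not text:
            return None  # empty or whitespace-only input -> NULL
        return self.convert_value(text)

    def convert_value(self, text):
        raise NotImplementedError

class NumConversion(BaseConversion):
    """Sketch of num(): strip grouping commas and parse as a float."""

    def convert_value(self, text):
        try:
            result = float(text.replace(",", ""))
            # Reject Infinity/NaN so special-value strings map to NULL.
            return result if math.isfinite(result) else None
        except ValueError:
            return None

print(NumConversion().convert("  "))     # None, handled by the base class
print(NumConversion().convert("1,234"))  # 1234.0
```

This mirrors the refactor described in the commit log, where the common logic was centralized into a base class and each function class kept only its own conversion.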
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (1)

425-427: Optional: Add JavaDoc to visitConvert.

All other visitor method stubs in this class lack JavaDoc; this new method follows that same pattern. However, the core/src/main/java/**/*.java guideline requires public methods to have JavaDoc with @param, @return, and @throws. Consider adding it here (and ideally addressing the systemic gap in a follow-up).

📝 Suggested JavaDoc
+  /**
+   * Visit Convert node.
+   *
+   * @param node {@link Convert}
+   * @param context Context
+   * @return Return Type.
+   */
   public T visitConvert(Convert node, C context) {
     return visitChildren(node, context);
   }

As per coding guidelines: "Public methods MUST have JavaDoc with @param, @return, and @throws" (applicable to core/src/main/java/**/*.java).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java` around
lines 425 - 427, Add JavaDoc to the public method visitConvert in
AbstractNodeVisitor: document the method purpose and include `@param` for the
Convert node and context (C context), `@return` describing the returned generic T,
and `@throws` for any runtime exceptions that may be propagated; ensure the
JavaDoc style matches other methods in AbstractNodeVisitor and complies with the
core/src/main/java/**/*.java requirement for public methods to have `@param`,
`@return`, and `@throws` tags.
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1)

1046-1062: memk() conversion function has no anonymizer test coverage.

All other conversion functions (auto, num, rmcomma, rmunit, none) have at least one assertion in testConvertCommand, but memk is absent. Add a case such as:

+    assertEquals(
+        "source=table | convert memk(identifier)",
+        anonymize("source=t | convert memk(memory)"));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java`
around lines 1046 - 1062, The testConvertCommand in PPLQueryDataAnonymizerTest
is missing coverage for the memk() conversion; add an assertion that calls
anonymize with an input using memk (e.g., "source=t | convert memk(field)") and
expects the anonymized output to map the field to the generic identifier form
consistent with other cases (e.g., "source=table | convert memk(identifier)" or
with AS when appropriate). Update the testConvertCommand method to include this
new assertEquals using the existing anonymize helper so memk conversion is
validated alongside auto, num, rmcomma, rmunit, and none.
core/src/main/java/org/opensearch/sql/ast/tree/Convert.java (1)

19-26: Avoid public setters to keep the AST node immutable.

@Setter exposes mutable state outside attach, which weakens immutability guarantees for AST nodes. Consider removing the class-level @Setter to keep mutation scoped.

♻️ Proposed change
-@Setter
 @ToString
 @EqualsAndHashCode(callSuper = false)
 @RequiredArgsConstructor
 public class Convert extends UnresolvedPlan {
As per coding guidelines: `core/src/main/java/org/opensearch/sql/ast/**/*.java`: AST nodes must be immutable where possible.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/main/java/org/opensearch/sql/ast/tree/Convert.java` around lines 19
- 26, Remove the class-level `@Setter` to prevent broad mutation of the AST node;
keep @Getter/@ToString/@EqualsAndHashCode/@RequiredArgsConstructor. Replace the
broad setter by adding a narrowly scoped mutator (e.g., a package-private method
like attach(UnresolvedPlan child) or attachChild(UnresolvedPlan child)) that
sets the private field child so only callers in the same package (or the
intended attach logic) can modify it; leave conversions as final and immutable
via the existing field.
core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java (2)

316-327: RmcommaConvertFunction returns null for all scientific-notation extreme values.

Lines 318–324 assert null for strings like "1.7976931348623157E308" and "1e309" because the E/e character triggers containsLetter(). This means rmcomma silently rejects any scientific-notation input. If this is intentional, a brief comment in the test explaining why would help future maintainers — containsLetter treats e as alphabetic, so all exponential representations are rejected.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java`
around lines 316 - 327, The test currently asserts null for scientific-notation
strings because RmcommaConvertFunction.convert uses containsLetter(), which
treats 'e'/'E' as alphabetic and thus rejects exponential notation; update the
test method testRmcommaConvertNumericExtremes to include a brief inline comment
(near the assertions referencing RmcommaConvertFunction.convert) stating that
scientific/exponential formats are intentionally rejected due to
containsLetter() treating 'e' as a letter, so these inputs return null, to help
future maintainers understand the behavior.
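The rejection behavior described here can be sketched in Python (the function and helper names are illustrative; the rules — strip commas, reject any input containing a letter, which excludes all scientific notation because 'e'/'E' count as letters — are taken from the review text, not from reading the Java source):

```python
def contains_letter(text):
    # 'e'/'E' are alphabetic, so "1.7976931348623157E308" and
    # "1e309" are rejected along with unit-suffixed strings.
    return any(ch.isalpha() for ch in text)

def rmcomma_convert(value):
    """Sketch of rmcomma(): remove grouping commas and parse, but
    return None for any input that contains a letter."""
    if value is None:
        return None
    text = str(value).strip()
    if not text or contains_letter(text):
        return None
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None

print(rmcomma_convert("1,234,567"))              # 1234567.0
print(rmcomma_convert("1.7976931348623157E308")) # None: 'E' is a letter
```

An inline comment like the one above is exactly what the reviewer asks for in the test, so the null assertions read as intentional rather than accidental.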

228-236: Special-value tests (∞, Infinity, NaN) only cover Num and Auto.

RmunitConvertFunction, RmcommaConvertFunction, and MemkConvertFunction are not tested against these special value strings. Per guidelines, error condition tests should be included for all new functions. Consider extending this test (or adding a sibling) to assert the expected behavior for the remaining three functions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java`
around lines 228 - 236, Extend the special-value test so RmunitConvertFunction,
RmcommaConvertFunction, and MemkConvertFunction are asserted to behave the same
as NumConvertFunction/AutoConvertFunction for the strings "∞", "Infinity", and
"NaN"; update ConversionFunctionsTest.testSpecialValues (or add a sibling test)
to call RmunitConvertFunction.convert(...), RmcommaConvertFunction.convert(...),
and MemkConvertFunction.convert(...) for each special string and assertNull on
each result to cover all new functions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java`:
- Around line 303-313: The enum entries CTIME, MKTIME, DUR2SEC, and MSTIME in
BuiltinFunctionName are declared but have no implementations or registrations
and will fail at runtime; either implement corresponding ConvertFunction classes
and register them in PPLFuncImpTable (similar to AUTO/NUM/MEMK/RMUNIT/RMCOMMA)
and add unit tests, or remove these four enum entries from BuiltinFunctionName
to avoid exposing unimplemented functions (note NONE is handled as a
special-case in AstBuilder.java and can remain). Ensure any new implementations
follow the existing ConvertFunction pattern and are included in test coverage
and the function dispatch table.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java`:
- Around line 11-17: Add JavaDoc comments for the public AutoConvertFunction()
constructor and the public static convert(Object value) entrypoint: for
AutoConvertFunction(), add a short description and an `@throws` tag if the
constructor can propagate any runtime exceptions (or explicitly state none are
thrown); for convert(Object) add a description, an `@param` value explaining
accepted input, an `@return` describing the conversion result, and an `@throws`
documenting the runtime exception(s) that may be raised (e.g., if
convertValue(value) fails). Reference the constructor AutoConvertFunction(), the
static method convert(Object), and the instance helper convertValue(Object) in
the comments so maintainers can trace behavior.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java`:
- Around line 11-17: Add JavaDoc comments for the public MemkConvertFunction()
constructor and the public static convert(Object value) entrypoint: for
MemkConvertFunction() document its purpose and any side effects and include
`@throws` if it can throw (or explicitly state none), and for convert(Object
value) document the parameter (`@param` value the input to convert), the return
value (`@return` the converted object or null) and any exceptions it can propagate
(`@throws` ... or state none); reference the convertValue(Object) helper in the
convert method description to clarify behavior.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/NumConvertFunction.java`:
- Around line 11-17: Add JavaDoc for the public API: document the
NumConvertFunction() constructor and the static method convert(Object value)
with `@param` describing the value argument, `@return` describing the returned
converted numeric/object result, and `@throws` for any runtime exceptions that
convertValue may propagate; reference the NumConvertFunction.convert(Object)
method and the NumConvertFunction() constructor and ensure the JavaDoc text
matches coding guidelines for public methods.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java`:
- Around line 11-17: Add JavaDoc to the public no-arg constructor
RmcommaConvertFunction() and the static entrypoint convert(Object value): for
the constructor, add a short description of the class initialization and include
any `@throws` if construction can fail; for the static convert method, add a
description, an `@param` value describing accepted input types, an `@return`
describing the converted value (or null) and an `@throws` documenting the
exceptions that may be raised during conversion (e.g., IllegalArgumentException
for invalid input or RuntimeException for conversion errors). Reference the
convertValue(Object) call inside convert(Object) so the JavaDoc reflects that
the static method delegates to convertValue for the actual conversion.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/RmunitConvertFunction.java`:
- Around line 11-17: Add JavaDoc comments for the public constructor
RmunitConvertFunction() and the static entrypoint method convert(Object) that
document parameters, return value and possible exceptions; for the constructor
include a brief description and note any thrown runtime exceptions if
applicable, and for convert(Object) add `@param` value describing accepted inputs,
`@return` describing the returned converted object (or null) and `@throws` entries
for any runtime exceptions propagated by convertValue(Object) so the JavaDoc
satisfies the project's @param/@return/@throws requirement while referencing the
convertValue(Object) behavior.

In
`@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java`:
- Around line 175-215: Add missing zero-boundary and numeric-extreme test cases
for MemkConvertFunction in ConversionFunctionsTest: create a test method (e.g.,
testMemkConvertZeroBoundaries) that asserts convert("0"), convert("0k"),
convert("0m"), convert("0g"), convert("+0") and convert("-0") return 0.0 (or
-0.0 as appropriate), and another method (e.g., testMemkConvertNumericExtremes)
that asserts convert(String.valueOf(Double.MAX_VALUE)) returns Double.MAX_VALUE
and that overflow/invalid huge values like "1e309" and "1e309k" return null; use
MemkConvertFunction.convert references to locate where to add these tests.
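The boundary cases requested here can be modeled with a short Python sketch (the k/m/g suffix multipliers below are assumptions — the review only pins down the zero and overflow expectations, not the base unit):

```python
import math

# Assumed multipliers with kilobytes as the base unit.
MULTIPLIERS = {"k": 1.0, "m": 1024.0, "g": 1024.0 * 1024.0}

def memk_convert(value):
    """Sketch of memk(): parse a number with an optional k/m/g suffix;
    return None for missing, empty, unparseable, or non-finite input."""
    if value is None:
        return None
    text = str(value).strip()
    if not text:
        return None
    multiplier = 1.0
    if text[-1].lower() in MULTIPLIERS:
        multiplier = MULTIPLIERS[text[-1].lower()]
        text = text[:-1]
    try:
        result = float(text) * multiplier
    except ValueError:
        return None
    # "1e309" and "1e309k" overflow to infinity -> None
    return result if math.isfinite(result) else None

print(memk_convert("0k"))     # 0.0
print(memk_convert("1e309"))  # None: overflow
```

Tests like `testMemkConvertZeroBoundaries` and `testMemkConvertNumericExtremes` would assert exactly these edge behaviors against the real `MemkConvertFunction.convert`.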

In `@docs/user/ppl/cmd/convert.md`:
- Around line 84-100: The examples use non-existent indices
(source=system_metrics in the convert example and source=metrics in the other
example) so update both examples to use the existing test index source=accounts
and simulate the intended fields with eval; locate the convert example
containing the pipeline with "source=system_metrics | eval memory='100m' |
convert memk(memory)" and the other example referencing "source=metrics" and
replace their source clauses with "source=accounts" and add an eval that creates
the memory/metric field values (e.g., eval memory='100m' or eval
<field>=<value>) so the examples run against the standard test data while
keeping the rest of the pipeline unchanged.
- Around line 60-78: The example's displayed row count is wrong for the query
`source=accounts | eval price='1,234' | convert num(price) | fields price`: it
will produce one result row per document in the index (all with price=1234.0).
Either append `| head 1` to the example query to legitimately show "fetched rows
/ total rows = 1/1", or update the shown result block to reflect the full-row
counts (e.g., fetched rows / total rows = N/N) and keep the original query;
modify the example in convert.md accordingly so the query and displayed result
are consistent.
- Around line 237-240: Fix the markdown table closing fence by replacing the
trailing pipe character in the closing separator string
"+------------+-----------+----------|" with a plus so it becomes
"+------------+-----------+----------+"; look for the table block that contains
the rows with "6 | Hattie | Bond" and "13 | Nanette | Bates" and update the
final separator line accordingly to correct the table formatting.
- Around line 19-21: The docs currently list `*` as a supported value for the
`<field>` parameter in the convert command table, but wildcard support is not
implemented (see the "when wildcard support is implemented" comment later);
update the `<field>` row in the parameter table to either remove the `*` example
or explicitly mark it as "not yet supported" (e.g., change "Single field name to
convert, or `*` to convert all fields." to "Single field name to convert;
wildcard `*` not yet supported."). Ensure the text mentions `*` and `<field>` so
reviewers can find and verify the change, and keep the rest of the table
unchanged.

In `@integ-test/src/test/java/org/opensearch/sql/ppl/ConvertCommandIT.java`:
- Around line 72-75: The test method verifyQueryThrowsCalciteError uses
assertThrows but the JUnit 5 static import is missing; add a static import for
org.junit.jupiter.api.Assertions.assertThrows at the top of the test file so the
call in verifyQueryThrowsCalciteError(...) resolves correctly when using JUnit 5
`@Test`.

In `@integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java`:
- Around line 229-238: The test testConvertCommand currently only calls
verifyQuery(result) in the ResponseException catch path, so when executeQuery
succeeds (Calcite enabled) nothing is asserted; change the flow to capture the
successful response into the existing JSONObject result inside the try (assign
the executeQuery return into result) and move a single verifyQuery(result) call
outside the try/catch so it runs for both success and failure; keep the catch
building result from TestUtils.getResponseBody(e.getResponse()) as it does now
so verifyQuery sees the error response when an exception occurs.

In
`@integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java`:
- Around line 213-239: Both testCrossClusterConvert and
testCrossClusterConvertWithAlias enable Calcite with enableCalcite() but call
disableCalcite() only on the happy path; wrap the body of each test in a try {
... } finally { disableCalcite(); } so disableCalcite() always runs even if
executeQuery(...) or verifyColumn(...) throws; locate the two test methods
(testCrossClusterConvert and testCrossClusterConvertWithAlias) and move their
enableCalcite() to before the try and ensure the executeQuery/verifyColumn calls
are inside the try block with disableCalcite() in the finally.

In `@ppl/src/main/antlr/OpenSearchPPLLexer.g4`:
- Line 163: The TIMEFORMAT lexer token is unused; either remove the TIMEFORMAT
token definition from OpenSearchPPLLexer.g4 to eliminate dead code, or if it's
intended, add corresponding parser grammar rules and/or Java references that
consume TIMEFORMAT (e.g., incorporate it into relevant parser rules that parse
time formats or make the token part of an existing rule), then regenerate the
lexer/parser and update any tests; search for TIMEFORMAT in the project to
confirm no existing usage before deleting or integrate it where parser rules
expect a time format token.

In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java`:
- Around line 257-271: The test method testConvertAutoWithMemoryField is
misnamed because it uses auto(JOB) aliased as memory_size while JOB is a string
job-title field; rename the test to reflect aliasing behavior (e.g.,
testConvertAutoAliasOnStringField or testConvertAutoWithAlias) or change the
input to use a memory-like field (e.g., replace JOB with a numeric/string
memory_field that contains values like "10k") so the name matches behavior;
update the method name testConvertAutoWithMemoryField and any associated
expected strings (retain auto(JOB) and memory_size if renaming, or replace
auto(JOB)/memory_size occurrences if you change the field) to keep assertions
consistent.
- Around line 1-272: The test class CalcitePPLConvertTest lacks required
NULL-input, boundary-condition, and error-condition tests for the five
conversion functions (auto, num, rmcomma, rmunit, memk); add unit tests that
exercise each function with (1) a nullable field (NULL-input) e.g., use COMM or
another nullable column and assert getRelNode(ppl) + verifyLogical(root, ...)
and verifyPPLToSparkSQL(root, ...), (2) a boundary/empty input case (e.g.,
rmcomma applied to fields that can be empty or empty-string equivalent) and
assert plan/SQL, and (3) an error-condition test that passes an invalid function
or argument and expects a parse/validation exception (use `@Test(expected =
SomeParseException.class)` around getRelNode(ppl)); implement these tests in
CalcitePPLConvertTest using the existing helpers getRelNode, verifyLogical, and
verifyPPLToSparkSQL and name them clearly like testConvertAutoWithNullInput,
testConvertNumBoundaryEmpty, testConvertInvalidFunctionThrows, etc.

---

Nitpick comments:
In `@core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java`:
- Around line 425-427: Add JavaDoc to the public method visitConvert in
AbstractNodeVisitor: document the method purpose and include `@param` for the
Convert node and context (C context), `@return` describing the returned generic T,
and `@throws` for any runtime exceptions that may be propagated; ensure the
JavaDoc style matches other methods in AbstractNodeVisitor and complies with the
core/src/main/java/**/*.java requirement for public methods to have `@param`,
`@return`, and `@throws` tags.

In `@core/src/main/java/org/opensearch/sql/ast/tree/Convert.java`:
- Around line 19-26: Remove the class-level `@Setter` to prevent broad mutation of
the AST node; keep
`@Getter/`@ToString/@EqualsAndHashCode/@RequiredArgsConstructor. Replace the broad
setter by adding a narrowly scoped mutator (e.g., a package-private method like
attach(UnresolvedPlan child) or attachChild(UnresolvedPlan child)) that sets the
private field child so only callers in the same package (or the intended attach
logic) can modify it; leave conversions as final and immutable via the existing
field.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/BaseConversionUDF.java`:
- Around line 25-53: Add JavaDoc for the public API methods: add Javadoc
comments to BaseConversionUDF#getReturnTypeInference() and
BaseConversionUDF#getOperandMetadata() including a brief description and
appropriate `@return` tags; also locate the core template method convertValue(...)
(in this class or ConversionImplementor) and add Javadoc describing its purpose,
parameters (`@param` for each argument), return value (`@return`) and any exceptions
it can throw (`@throws`). Ensure the Javadoc text follows project conventions and
documents nullability/behavior where relevant.

In
`@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java`:
- Around line 316-327: The test currently asserts null for scientific-notation
strings because RmcommaConvertFunction.convert uses containsLetter(), which
treats 'e'/'E' as alphabetic and thus rejects exponential notation; update the
test method testRmcommaConvertNumericExtremes to include a brief inline comment
(near the assertions referencing RmcommaConvertFunction.convert) stating that
scientific/exponential formats are intentionally rejected due to
containsLetter() treating 'e' as a letter, so these inputs return null, to help
future maintainers understand the behavior.
- Around line 228-236: Extend the special-value test so RmunitConvertFunction,
RmcommaConvertFunction, and MemkConvertFunction are asserted to behave the same
as NumConvertFunction/AutoConvertFunction for the strings "∞", "Infinity", and
"NaN"; update ConversionFunctionsTest.testSpecialValues (or add a sibling test)
to call RmunitConvertFunction.convert(...), RmcommaConvertFunction.convert(...),
and MemkConvertFunction.convert(...) for each special string and assertNull on
each result to cover all new functions.
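
The containsLetter() behavior behind the first finding can be illustrated with a small stand-in (this mirrors only the documented check, not the real conversion code):

```java
class ContainsLetterSketch {
  /** Mirrors the documented check: any alphabetic character rejects the input. */
  static boolean containsLetter(String s) {
    return s.chars().anyMatch(Character::isLetter);
  }

  public static void main(String[] args) {
    // "1,234.5" has no letters, so rmcomma() can proceed to parse it.
    System.out.println(containsLetter("1,234.5"));
    // "1.5e3" contains 'e', which Character.isLetter treats as alphabetic,
    // so the conversion is rejected and convert() returns null.
    System.out.println(containsLetter("1.5e3"));
  }
}
```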

In
`@integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java`:
- Around line 450-459: Add a concrete row-value assertion to
testCrossClusterConvertWithAlias: after the existing verifyColumn(result,
columnName("balance_num")) and verifySchema(result, schema("balance_num",
"double")), call verifyDataRows(result, rows("balance_num",
"<expected_double>")) to assert at least one converted numeric value is correct
(replace <expected_double> with the known balance value from
TEST_INDEX_BANK_REMOTE as a double); reference symbols:
testCrossClusterConvertWithAlias, executeQuery, verifyColumn, verifySchema,
verifyDataRows.

In
`@integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java`:
- Around line 217-222: Current assertions in CrossClusterSearchIT call
verifyColumn(result, columnName("balance")) after executeQuery(...) which only
checks column presence; update the tests to assert actual converted numeric
values by invoking verifyDataRows(result, expectedRows...) using the same 7
expected balance values used in testCrossClusterSortWithTypeCasting, or at
minimum add assertions that the result is non-empty and that each value in the
converted "balance" column is numeric; locate the call sites around
executeQuery/String.format(...) with TEST_INDEX_BANK_REMOTE and replace or
extend the verifyColumn(...) checks with verifyDataRows(...) (or additional
numeric-type/non-empty assertions) to make the test meaningful.

In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java`:
- Around line 1-272: The test suite in CalcitePPLConvertTest only covers
positive convert cases; add negative grammar tests to validate error handling
(at minimum a test that asserts an invalid conversion function is rejected). In
CalcitePPLConvertTest add a new `@Test` method (e.g., testConvertInvalidFunction)
that calls getRelNode("source=EMP | convert unknown_fn(SAL)") and asserts an
exception is thrown (use assertThrows or the existing test harness), and
similarly add tests for missing argument ("convert auto()"), non-existent field
("convert auto(NONEXISTENT_COL)"), and alias shadowing with none() (e.g.,
"convert none(EMPNO) AS EMPNO") to verify the parser/planner raises/propagates
errors instead of succeeding; use getRelNode and the existing verification
helpers (verifyLogical/verifyPPLToSparkSQL) pattern for positive vs negative
assertions.

In
`@ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java`:
- Around line 1046-1062: The testConvertCommand in PPLQueryDataAnonymizerTest is
missing coverage for the memk() conversion; add an assertion that calls
anonymize with an input using memk (e.g., "source=t | convert memk(field)") and
expects the anonymized output to map the field to the generic identifier form
consistent with other cases (e.g., "source=table | convert memk(identifier)" or
with AS when appropriate). Update the testConvertCommand method to include this
new assertEquals using the existing anonymize helper so memk conversion is
validated alongside auto, num, rmcomma, rmunit, and none.

Comment on lines +11 to +17
public AutoConvertFunction() {
super(AutoConvertFunction.class);
}

public static Object convert(Object value) {
return new AutoConvertFunction().convertValue(value);
}

⚠️ Potential issue | 🟡 Minor

Add JavaDoc for the public constructor and convert entrypoint.

These public methods currently lack JavaDoc with @param/@return/@throws. Please add the required JavaDoc for the constructor and static convert method.

As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java`
around lines 11 - 17, Add JavaDoc comments for the public AutoConvertFunction()
constructor and the public static convert(Object value) entrypoint: for
AutoConvertFunction(), add a short description and an `@throws` tag if the
constructor can propagate any runtime exceptions (or explicitly state none are
thrown); for convert(Object) add a description, an `@param` value explaining
accepted input, an `@return` describing the conversion result, and an `@throws`
documenting the runtime exception(s) that may be raised (e.g., if
convertValue(value) fails). Reference the constructor AutoConvertFunction(), the
static method convert(Object), and the instance helper convertValue(Object) in
the comments so maintainers can trace behavior.

Comment on lines +11 to +17
public MemkConvertFunction() {
super(MemkConvertFunction.class);
}

public static Object convert(Object value) {
return new MemkConvertFunction().convertValue(value);
}

⚠️ Potential issue | 🟡 Minor

Add JavaDoc for the public constructor and convert entrypoint.

These public methods currently lack JavaDoc with @param/@return/@throws. Please add the required JavaDoc for the constructor and static convert method.

As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java`
around lines 11 - 17, Add JavaDoc comments for the public MemkConvertFunction()
constructor and the public static convert(Object value) entrypoint: for
MemkConvertFunction() document its purpose and any side effects and include
`@throws` if it can throw (or explicitly state none), and for convert(Object
value) document the parameter (`@param` value the input to convert), the return
value (`@return` the converted object or null) and any exceptions it can propagate
(`@throws` ... or state none); reference the convertValue(Object) helper in the
convert method description to clarify behavior.

Comment on lines +11 to +17
public NumConvertFunction() {
super(NumConvertFunction.class);
}

public static Object convert(Object value) {
return new NumConvertFunction().convertValue(value);
}

⚠️ Potential issue | 🟡 Minor

Document the public convert API.

The public convert method (and constructor if kept public) should have JavaDoc with @param/@return.

✍️ Example JavaDoc
+  /**
+   * Convert a value using num() semantics.
+   *
+   * @param value input value
+   * @return converted double, or null if not convertible
+   */
   public static Object convert(Object value) {
     return new NumConvertFunction().convertValue(value);
   }
As per coding guidelines: `core/src/main/java/**/*.java`: Public methods MUST have JavaDoc with `@param`, `@return`, and `@throws`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/NumConvertFunction.java`
around lines 11 - 17, Add JavaDoc for the public API: document the
NumConvertFunction() constructor and the static method convert(Object value)
with `@param` describing the value argument, `@return` describing the returned
converted numeric/object result, and `@throws` for any runtime exceptions that
convertValue may propagate; reference the NumConvertFunction.convert(Object)
method and the NumConvertFunction() constructor and ensure the JavaDoc text
matches coding guidelines for public methods.

Comment on lines +11 to +17
public RmcommaConvertFunction() {
super(RmcommaConvertFunction.class);
}

public static Object convert(Object value) {
return new RmcommaConvertFunction().convertValue(value);
}

⚠️ Potential issue | 🟡 Minor

Add JavaDoc for the public constructor and convert entrypoint.

These public methods currently lack JavaDoc with @param/@return/@throws. Please add the required JavaDoc for the constructor and static convert method.

As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java`
around lines 11 - 17, Add JavaDoc to the public no-arg constructor
RmcommaConvertFunction() and the static entrypoint convert(Object value): for
the constructor, add a short description of the class initialization and include
any `@throws` if construction can fail; for the static convert method, add a
description, an `@param` value describing accepted input types, an `@return`
describing the converted value (or null) and an `@throws` documenting the
exceptions that may be raised during conversion (e.g., IllegalArgumentException
for invalid input or RuntimeException for conversion errors). Reference the
convertValue(Object) call inside convert(Object) so the JavaDoc reflects that
the static method delegates to convertValue for the actual conversion.

Comment on lines +213 to +239
@Test
public void testCrossClusterConvert() throws IOException {
enableCalcite();

JSONObject result =
executeQuery(
String.format(
"search source=%s | convert auto(balance) | fields balance",
TEST_INDEX_BANK_REMOTE));
verifyColumn(result, columnName("balance"));

disableCalcite();
}

@Test
public void testCrossClusterConvertWithAlias() throws IOException {
enableCalcite();

JSONObject result =
executeQuery(
String.format(
"search source=%s | convert auto(balance) AS balance_num | fields balance_num",
TEST_INDEX_BANK_REMOTE));
verifyColumn(result, columnName("balance_num"));

disableCalcite();
}

⚠️ Potential issue | 🟠 Major

disableCalcite() must be called in a finally block to prevent Calcite state leaking into subsequent tests.

Both testCrossClusterConvert and testCrossClusterConvertWithAlias enable Calcite at the top but only disable it on the happy path. Any exception thrown from executeQuery or verifyColumn (e.g., an assertion failure, an IOException, or a network error) will skip disableCalcite(), leaving Calcite enabled for every test that runs afterwards. This violates test independence and can cause cascading, order-dependent failures in the rest of the suite.

🛡️ Proposed fix: wrap in try-finally
   @Test
   public void testCrossClusterConvert() throws IOException {
     enableCalcite();
-
-    JSONObject result =
-        executeQuery(
-            String.format(
-                "search source=%s | convert auto(balance) | fields balance",
-                TEST_INDEX_BANK_REMOTE));
-    verifyColumn(result, columnName("balance"));
-
-    disableCalcite();
+    try {
+      JSONObject result =
+          executeQuery(
+              String.format(
+                  "search source=%s | convert auto(balance) | fields balance",
+                  TEST_INDEX_BANK_REMOTE));
+      verifyColumn(result, columnName("balance"));
+    } finally {
+      disableCalcite();
+    }
   }

   @Test
   public void testCrossClusterConvertWithAlias() throws IOException {
     enableCalcite();
-
-    JSONObject result =
-        executeQuery(
-            String.format(
-                "search source=%s | convert auto(balance) AS balance_num | fields balance_num",
-                TEST_INDEX_BANK_REMOTE));
-    verifyColumn(result, columnName("balance_num"));
-
-    disableCalcite();
+    try {
+      JSONObject result =
+          executeQuery(
+              String.format(
+                  "search source=%s | convert auto(balance) AS balance_num | fields balance_num",
+                  TEST_INDEX_BANK_REMOTE));
+      verifyColumn(result, columnName("balance_num"));
+    } finally {
+      disableCalcite();
+    }
   }

As per coding guidelines, "Tests must not rely on execution order; ensure test independence" and "Ensure tests are independent and can run in any order."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java`
around lines 213 - 239, Both testCrossClusterConvert and
testCrossClusterConvertWithAlias enable Calcite with enableCalcite() but call
disableCalcite() only on the happy path; wrap the body of each test in a try {
... } finally { disableCalcite(); } so disableCalcite() always runs even if
executeQuery(...) or verifyColumn(...) throws; locate the two test methods
(testCrossClusterConvert and testCrossClusterConvertWithAlias) and move their
enableCalcite() to before the try and ensure the executeQuery/verifyColumn calls
are inside the try block with disableCalcite() in the finally.

OTHERSTR: 'OTHERSTR';
NULLSTR: 'NULLSTR';
TIMEFIELD: 'TIMEFIELD';
TIMEFORMAT: 'TIMEFORMAT';

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if TIMEFORMAT is referenced in any parser grammar or Java source files
echo "=== References in .g4 parser files ==="
rg -n 'TIMEFORMAT' -t g4 2>/dev/null || echo "No g4 type found, trying without type filter..."
rg -n 'TIMEFORMAT' --type-add 'g4:*.g4' -t g4 || rg -n 'TIMEFORMAT' -g '*.g4'

echo ""
echo "=== References in Java files ==="
rg -n 'TIMEFORMAT' -t java 2>/dev/null | head -20

echo ""
echo "=== All references in repo ==="
rg -n 'TIMEFORMAT' | head -30

Repository: opensearch-project/sql

Length of output: 766


Remove the unused TIMEFORMAT token or verify it's needed elsewhere.

The TIMEFORMAT token at line 163 is defined in the lexer but has no references anywhere in the parser grammar or Java codebase. If it is intended for future use, integrate it into the parser rules; otherwise remove it to avoid dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ppl/src/main/antlr/OpenSearchPPLLexer.g4` at line 163, The TIMEFORMAT lexer
token is unused; either remove the TIMEFORMAT token definition from
OpenSearchPPLLexer.g4 to eliminate dead code, or if it's intended, add
corresponding parser grammar rules and/or Java references that consume
TIMEFORMAT (e.g., incorporate it into relevant parser rules that parse time
formats or make the token part of an existing rule), then regenerate the
lexer/parser and update any tests; search for TIMEFORMAT in the project to
confirm no existing usage before deleting or integrate it where parser rules
expect a time format token.

Comment on lines +1 to +272
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.ppl.calcite;

import org.apache.calcite.rel.RelNode;
import org.apache.calcite.test.CalciteAssert;
import org.junit.Test;

/** Unit tests for PPL convert command. */
public class CalcitePPLConvertTest extends CalcitePPLAbstractTest {

public CalcitePPLConvertTest() {
super(CalciteAssert.SchemaSpec.SCOTT_WITH_TEMPORAL);
}

@Test
public void testConvertBasic() {
String ppl = "source=EMP | convert auto(SAL)";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[AUTO($5)],"
+ " COMM=[$6], DEPTNO=[$7])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, AUTO(`SAL`) `SAL`, `COMM`, `DEPTNO`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertWithAlias() {
String ppl = "source=EMP | convert auto(SAL) AS salary_num";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
+ " COMM=[$6], DEPTNO=[$7], salary_num=[AUTO($5)])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`SAL`)"
+ " `salary_num`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertMultipleFunctions() {
String ppl = "source=EMP | convert auto(SAL), num(COMM)";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[AUTO($5)],"
+ " COMM=[NUM($6)], DEPTNO=[$7])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, AUTO(`SAL`) `SAL`, NUM(`COMM`) `COMM`,"
+ " `DEPTNO`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertMultipleWithAliases() {
String ppl = "source=EMP | convert auto(SAL) AS salary, num(COMM) AS commission";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
+ " COMM=[$6], DEPTNO=[$7], salary=[AUTO($5)], commission=[NUM($6)])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`SAL`)"
+ " `salary`, NUM(`COMM`) `commission`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertWithFields() {
String ppl = "source=EMP | convert auto(SAL) AS salary_num | fields EMPNO, ENAME, salary_num";
RelNode root = getRelNode(ppl);
// Calcite optimizes the two projections into one - this is more efficient
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], salary_num=[AUTO($5)])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, AUTO(`SAL`) `salary_num`\n" + "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertNumFunction() {
String ppl = "source=EMP | convert num(SAL)";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[NUM($5)],"
+ " COMM=[$6], DEPTNO=[$7])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, NUM(`SAL`) `SAL`, `COMM`, `DEPTNO`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertRmcommaFunction() {
String ppl = "source=EMP | convert rmcomma(ENAME)";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[RMCOMMA($1)], JOB=[$2], MGR=[$3], HIREDATE=[$4],"
+ " SAL=[$5], COMM=[$6], DEPTNO=[$7])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, RMCOMMA(`ENAME`) `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`,"
+ " `DEPTNO`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertRmunitFunction() {
String ppl = "source=EMP | convert rmunit(ENAME)";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[RMUNIT($1)], JOB=[$2], MGR=[$3], HIREDATE=[$4],"
+ " SAL=[$5], COMM=[$6], DEPTNO=[$7])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, RMUNIT(`ENAME`) `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`,"
+ " `DEPTNO`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertMemkFunction() {
String ppl = "source=EMP | convert memk(ENAME)";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[MEMK($1)], JOB=[$2], MGR=[$3], HIREDATE=[$4],"
+ " SAL=[$5], COMM=[$6], DEPTNO=[$7])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, MEMK(`ENAME`) `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`,"
+ " `DEPTNO`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertNoneFunction() {
String ppl = "source=EMP | convert none(ENAME)";
RelNode root = getRelNode(ppl);
String expectedLogical = "LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql = "SELECT *\n" + "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertWithWhere() {
String ppl = "source=EMP | where DEPTNO = 10 | convert auto(SAL)";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[AUTO($5)],"
+ " COMM=[$6], DEPTNO=[$7])\n"
+ " LogicalFilter(condition=[=($7, 10)])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, AUTO(`SAL`) `SAL`, `COMM`, `DEPTNO`\n"
+ "FROM `scott`.`EMP`\n"
+ "WHERE `DEPTNO` = 10";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertWithSort() {
String ppl = "source=EMP | convert auto(SAL) AS salary_num | sort - salary_num | head 3";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalSort(sort0=[$8], dir0=[DESC-nulls-last], fetch=[3])\n"
+ " LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4],"
+ " SAL=[$5], COMM=[$6], DEPTNO=[$7], salary_num=[AUTO($5)])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`SAL`)"
+ " `salary_num`\n"
+ "FROM `scott`.`EMP`\n"
+ "ORDER BY 9 DESC\n"
+ "LIMIT 3";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertWithStats() {
String ppl = "source=EMP | convert auto(SAL) AS salary_num | stats avg(salary_num) by DEPTNO";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(avg(salary_num)=[$1], DEPTNO=[$0])\n"
+ " LogicalAggregate(group=[{0}], avg(salary_num)=[AVG($1)])\n"
+ " LogicalProject(DEPTNO=[$7], salary_num=[AUTO($5)])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT AVG(AUTO(`SAL`)) `avg(salary_num)`, `DEPTNO`\n"
+ "FROM `scott`.`EMP`\n"
+ "GROUP BY `DEPTNO`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertAllFunctions() {
String ppl =
"source=EMP | convert auto(SAL) AS sal_auto, num(COMM) AS comm_num, rmcomma(ENAME) AS"
+ " name_clean, rmunit(JOB) AS job_clean, none(EMPNO) AS empno_same";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
+ " COMM=[$6], DEPTNO=[$7], sal_auto=[AUTO($5)], comm_num=[NUM($6)],"
+ " name_clean=[RMCOMMA($1)], job_clean=[RMUNIT($2)], empno_same=[$0])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`SAL`)"
+ " `sal_auto`, NUM(`COMM`) `comm_num`, RMCOMMA(`ENAME`) `name_clean`, RMUNIT(`JOB`)"
+ " `job_clean`, `EMPNO` `empno_same`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

@Test
public void testConvertAutoWithMemoryField() {
String ppl = "source=EMP | convert auto(JOB) AS memory_size";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
+ " COMM=[$6], DEPTNO=[$7], memory_size=[AUTO($2)])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`JOB`)"
+ " `memory_size`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}
}

⚠️ Potential issue | 🟠 Major

Missing required NULL-input, boundary-condition, and error-condition tests for all five conversion functions

The coding guidelines for **/+(*Test|*IT).java files explicitly mandate:

  • NULL input tests for all new functions
  • Boundary condition tests (min/max values, empty inputs)
  • Error condition tests (invalid inputs, exceptions)

None of these are present in this class. The five conversion functions (auto, num, rmcomma, rmunit, memk) each need at least one of each category. Examples of what should be added:

// NULL input — verifying SQL generation/plan doesn't blow up on nullable field
@Test
public void testConvertNumWithNullInput() {
  // COMM is a nullable field in the SCOTT EMP schema
  String ppl = "source=EMP | convert num(COMM)";
  RelNode root = getRelNode(ppl);
  // assert plan handles null-bearing field correctly
  verifyLogical(root, ...);
  verifyPPLToSparkSQL(root, ...);
}

// Boundary condition — empty-string equivalent: applying rmcomma to a field
// with potentially empty-string values
@Test
public void testConvertRmcommaWithEmptyString() { ... }

// Error condition — converting a field with a nonsensical function argument
// or verifying an exception is thrown on illegal input
@Test(expected = SomeParseException.class)
public void testConvertInvalidFunctionThrows() {
  String ppl = "source=EMP | convert unknown_fn(SAL)";
  getRelNode(ppl);
}

As per coding guidelines, NULL input tests must be included for all new functions and boundary condition tests (min/max values, empty inputs) for all new functions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java`
around lines 1 - 272, The test class CalcitePPLConvertTest lacks required
NULL-input, boundary-condition, and error-condition tests for the five
conversion functions (auto, num, rmcomma, rmunit, memk); add unit tests that
exercise each function with (1) a nullable field (NULL-input) e.g., use COMM or
another nullable column and assert getRelNode(ppl) + verifyLogical(root, ...)
and verifyPPLToSparkSQL(root, ...), (2) a boundary/empty input case (e.g.,
rmcomma applied to fields that can be empty or empty-string equivalent) and
assert plan/SQL, and (3) an error-condition test that passes an invalid function
or argument and expects a parse/validation exception (use `@Test(expected =
SomeParseException.class)` around getRelNode(ppl)); implement these tests in
CalcitePPLConvertTest using the existing helpers getRelNode, verifyLogical, and
verifyPPLToSparkSQL and name them clearly like testConvertAutoWithNullInput,
testConvertNumBoundaryEmpty, testConvertInvalidFunctionThrows, etc.

Comment on lines +257 to +271
public void testConvertAutoWithMemoryField() {
String ppl = "source=EMP | convert auto(JOB) AS memory_size";
RelNode root = getRelNode(ppl);
String expectedLogical =
"LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
+ " COMM=[$6], DEPTNO=[$7], memory_size=[AUTO($2)])\n"
+ " LogicalTableScan(table=[[scott, EMP]])\n";
verifyLogical(root, expectedLogical);

String expectedSparkSql =
"SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`JOB`)"
+ " `memory_size`\n"
+ "FROM `scott`.`EMP`";
verifyPPLToSparkSQL(root, expectedSparkSql);
}

⚠️ Potential issue | 🟡 Minor

Misleading test name: testConvertAutoWithMemoryField

The test applies auto(JOB) with alias memory_size, but JOB is a text job-title field in the SCOTT schema — not a memory-related field. The name implies memory-specific behavior (e.g., "10k", "1.5M" strings) is being exercised, which it isn't. This is misleading for anyone trying to understand the test intent.

Consider renaming to testConvertAutoWithAlias (since testConvertWithAlias already covers SAL with auto, this could be testConvertAutoAliasOnStringField) or updating the test to actually use a field resembling a memory value.

✏️ Suggested rename
-  public void testConvertAutoWithMemoryField() {
-    String ppl = "source=EMP | convert auto(JOB) AS memory_size";
+  public void testConvertAutoAliasOnStringField() {
+    String ppl = "source=EMP | convert auto(JOB) AS job_converted";

As per coding guidelines, Test naming should follow pattern test<Functionality><Condition>.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java`
around lines 257 - 271, The test method testConvertAutoWithMemoryField is
misnamed because it uses auto(JOB) aliased as memory_size while JOB is a string
job-title field; rename the test to reflect aliasing behavior (e.g.,
testConvertAutoAliasOnStringField or testConvertAutoWithAlias) or change the
input to use a memory-like field (e.g., replace JOB with a numeric/string
memory_field that contains values like "10k") so the name matches behavior;
update the method name testConvertAutoWithMemoryField and any associated
expected strings (retain auto(JOB) and memory_size if renaming, or replace
auto(JOB)/memory_size occurrences if you change the field) to keep assertions
consistent.

Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (2)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)

984-1093: Consider validating allowed convert functions.

If convert is meant to accept only the conversion UDFs (auto/num/rmcomma/rmunit/none/memk), this currently permits any single-argument function name. Consider an allowlist check (or explicitly document that arbitrary UDFs are accepted).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`
around lines 984 - 1093, The convert implementation accepts any single-argument
function name; restrict it to the allowed conversion UDFs by adding an allowlist
check in processFunctionConversion (before calling
PPLFuncImpTable.INSTANCE.resolve): define the set of permitted names (e.g.,
"auto","num","rmcomma","rmunit","none","memk") and if functionName is not in
that set throw a SemanticCheckException with a clear message; keep the rest of
processFunctionConversion logic (replacements/additions) intact and update any
related tests or docs if you intend to allow additional functions.
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (1)

1206-1214: Prefer Objects::nonNull over inline null check; confirm empty-list behavior

Objects is already imported. The inline lambda .filter(conversion -> conversion != null) should be .filter(Objects::nonNull). Additionally, if every conversion is a none(field) with the same source and target, conversions will be empty. Verify that a Convert node with an empty conversion list is handled gracefully downstream (analyzer, physical operator, Calcite planner).

♻️ Proposed refactor
-    List<Let> conversions =
-        ctx.convertFunction().stream()
-            .map(this::buildConversion)
-            .filter(conversion -> conversion != null)
-            .collect(Collectors.toList());
+    List<Let> conversions =
+        ctx.convertFunction().stream()
+            .map(this::buildConversion)
+            .filter(Objects::nonNull)
+            .collect(Collectors.toList());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` around lines
1206 - 1214, Replace the inline null-check filter in visitConvertCommand by
using Objects::nonNull (i.e., change .filter(conversion -> conversion != null)
to .filter(Objects::nonNull)) so the stream is cleaner; while here, confirm
behavior when buildConversion yields no conversions (e.g., all none(field)) by
verifying downstream handling of a Convert constructed with an empty conversions
list—trace usages of Convert and the conversions list through the analyzer,
physical operator builders, and Calcite planner and either ensure they accept an
empty list or add a guard in visitConvertCommand to produce an appropriate
no-op/validation/error if empty conversions are invalid.
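A minimal, self-contained illustration of the method-reference form and the empty-list edge case flagged above (plain strings stand in for Let nodes; purely illustrative):

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Stream;

public class NonNullFilterDemo {
    public static void main(String[] args) {
        // buildConversion returns null for a no-op none(field); the filter drops it.
        List<String> conversions =
            Stream.of("num(SAL)", null, "rmcomma(COMM)")
                .filter(Objects::nonNull)
                .toList();
        System.out.println(conversions); // [num(SAL), rmcomma(COMM)]

        // If every conversion is a no-op, the list comes out empty -- the case
        // whose downstream handling the comment asks to verify.
        List<String> empty = Stream.of((String) null).filter(Objects::nonNull).toList();
        System.out.println(empty.isEmpty()); // true
    }
}
```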
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java`:
- Around line 460-461: Add a missing closing brace to end the method
testCrossClusterConvertWithAlias and add the JUnit `@Test` annotation immediately
before the method declaration of testCrossClusterMvExpandBasic so the first
method is properly closed and the second is recognized as a test; locate the
methods testCrossClusterConvertWithAlias and testCrossClusterMvExpandBasic in
CalciteCrossClusterSearchIT and insert a single closing '}' after the
verifySchema(...) call and add '@Test' on the line directly above the public
void testCrossClusterMvExpandBasic() declaration.

In `@ppl/src/main/antlr/OpenSearchPPLParser.g4`:
- Around line 543-549: Add negative/grammar-level tests for
convertCommand/convertFunction in CalcitePPLConvertTest: create test cases that
feed malformed convert syntaxes (e.g., missing field argument inside
LT_PRTHS...RT_PRTHS, malformed function names, missing LT_PRTHS/RT_PRTHS,
stray/extra COMMA, or misplaced AS alias) and assert the parser fails (throws a
RecognitionException or records parse errors) rather than building an AST;
reference convertCommand, convertFunction, visitConvertCommand, and
buildConversion to ensure the tests target parsing failures (not just
semantic/AST logic) and cover edge cases alongside the existing 14 positive
tests.

In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java`:
- Around line 1225-1229: The current check for the "none" no-op uses toString()
comparison between fieldArg and targetField.getField(), which is fragile;
instead detect the no-op by checking the function context alias (funcCtx.alias
== null) because determineTargetField returns the same field when no alias is
provided. Replace the ternary that returns null or new Let(targetField,
fieldArg) with a condition that if "none".equalsIgnoreCase(functionName) and
funcCtx.alias == null then return null, otherwise return new Let(targetField,
fieldArg); update the block referencing functionName, fieldArg, targetField (and
Let) accordingly so the redundant new Let(field, field) is not created when
alias is absent.
- Around line 1221-1231: The parser extracts functionName via
funcCtx.functionName.getText() but passes the raw string into AstDSL.function;
keep the equalsIgnoreCase("none") check as-is but normalize the name before
creating the function node: compute a normalizedName (e.g.,
functionName.toLowerCase()) and use that when calling
AstDSL.function(normalizedName, fieldArg) so the Let creation and
AstDSL.function use a consistent normalized function name (references:
functionName, funcCtx.functionName.getText(), AstDSL.function,
determineTargetField, Let).

In `@ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java`:
- Around line 463-495: In visitConvert, handle the case where
conversion.getExpression() is a Field (the "none(field) AS alias" case) so
functionName isn’t left empty: if conversion.getExpression() is a Field, set
functionName = "none", set actualSourceField to that Field's string, and set
fields to MASK_COLUMN (so the resulting format becomes "none(***) AS ***"); keep
the existing logic for Function nodes unchanged and preserve the existing
asClause/targetField logic that compares targetField to actualSourceField.

---

Duplicate comments:
In `@integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java`:
- Around line 232-241: The test currently only verifies the result inside the
catch block, so when executeQuery succeeds the test does not assert anything;
modify testConvertCommand to capture the successful response into the existing
JSONObject result and call verifyQuery(result) on the success path as well:
inside the try block assign the response from executeQuery to result (or parse
the returned response body into result using TestUtils.getResponseBody if
executeQuery returns a response object) and then call verifyQuery(result); keep
the existing catch block for ResponseException to preserve the error-path
behavior.

In `@ppl/src/main/antlr/OpenSearchPPLLexer.g4`:
- Line 164: The TIMEFORMAT lexer token is currently unused in the grammar;
either remove its definition from OpenSearchPPLLexer.g4 or wire it into parser
rules that consume it (e.g., add a parser rule like timeFormatSpec : TIMEFORMAT
... or use TIMEFORMAT in existing rules such as functionCall, literal, or format
clauses). Locate the TIMEFORMAT token in OpenSearchPPLLexer.g4 and decide:
delete the 'TIMEFORMAT' token line and run a project-wide search for TIMEFORMAT
to ensure no references remain, or add the appropriate parser rule(s) and update
any callers (e.g., format-related rules) so the token is actually consumed by
the parser.


Comment on lines +460 to 461
verifySchema(result, schema("balance_num", "double"));
public void testCrossClusterMvExpandBasic() throws IOException {

⚠️ Potential issue | 🔴 Critical


Compile error: missing closing } for testCrossClusterConvertWithAlias; @Test annotation also missing from testCrossClusterMvExpandBasic.

The file is missing two required lines between line 460 and line 461:

  1. A closing brace } to end testCrossClusterConvertWithAlias — without it the method declaration testCrossClusterMvExpandBasic is parsed as a statement inside testCrossClusterConvertWithAlias's body, causing a Java compilation error.
  2. The @Test annotation before testCrossClusterMvExpandBasic — without it the method will not be recognized as a test and will not run.
Proposed fix
     verifySchema(result, schema("balance_num", "double"));
+  }
+
+  @Test
   public void testCrossClusterMvExpandBasic() throws IOException {

Comment on lines +543 to +549
convertCommand
: CONVERT convertFunction (COMMA? convertFunction)*
;

convertFunction
: functionName = ident LT_PRTHS fieldExpression RT_PRTHS (AS alias = fieldExpression)?
;

⚠️ Potential issue | 🟡 Minor

Add negative/grammar-level tests for convertCommand syntax validation.

AST builder support is confirmed—visitConvertCommand properly handles convertFunction contexts and aliasing via buildConversion. However, CalcitePPLConvertTest covers only positive cases (14 test methods). Add tests for invalid syntax (e.g., missing field argument, malformed function names, edge cases) at the parser/grammar level to validate error handling during parsing.

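A parameterized negative test along those lines might look like the following; the regex is a deliberately rough stand-in for the real convertFunction ANTLR rule, used only so the sketch is self-contained:

```java
import java.util.List;
import java.util.regex.Pattern;

public class ConvertSyntaxDemo {
    // Rough stand-in for:
    //   convertFunction : ident LT_PRTHS fieldExpression RT_PRTHS (AS alias)?
    static final Pattern CONVERT_FN =
        Pattern.compile("\\w+\\(\\w+\\)(\\s+AS\\s+\\w+)?");

    static boolean parses(String s) {
        return CONVERT_FN.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Positive cases mirroring the existing tests.
        System.out.println(parses("num(SAL)"));
        System.out.println(parses("auto(JOB) AS job_converted"));

        // Malformed inputs the review asks to cover as parser failures:
        // missing field argument, missing parenthesis, dangling AS.
        for (String bad : List.of("num()", "num SAL", "num(SAL", "num(SAL) AS")) {
            System.out.println(bad + " -> " + parses(bad)); // all false
        }
    }
}
```

In the real suite these malformed strings would be fed to the PPL parser and asserted to raise a syntax error, rather than matched against a regex.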

Comment on lines +1221 to +1231
String functionName = funcCtx.functionName.getText();
UnresolvedExpression fieldArg = internalVisitExpression(funcCtx.fieldExpression(0));
Field targetField = determineTargetField(funcCtx, fieldArg);

if ("none".equalsIgnoreCase(functionName)) {
return fieldArg.toString().equals(targetField.getField().toString())
? null
: new Let(targetField, fieldArg);
}

return new Let(targetField, AstDSL.function(functionName, fieldArg));

⚠️ Potential issue | 🟡 Minor

Normalize function name to lowercase before passing to AstDSL.function

The function name from funcCtx.functionName.getText() is passed to AstDSL.function() without normalization. While the "none" check correctly uses equalsIgnoreCase(), the function node itself uses the raw, un-normalized string. PPL's convert functions are registered in uppercase (AUTO, NUM, RMCOMMA, RMUNIT, MEMK) in PPLBuiltinOperators. Although downstream case-insensitive resolution may mask this issue, normalizing at parse time ensures consistency and prevents potential failures if the resolver behavior changes.

🛠️ Suggested fix
-    String functionName = funcCtx.functionName.getText();
+    String functionName = funcCtx.functionName.getText().toLowerCase(Locale.ROOT);
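One reason to prefer an explicit locale when normalizing: `String.toLowerCase()` with no argument is locale-sensitive. A small standalone demonstration, unrelated to the actual parser code:

```java
import java.util.Locale;

public class NormalizeDemo {
    public static void main(String[] args) {
        String functionName = "RMUNIT";

        // Locale.ROOT gives stable, locale-independent results.
        System.out.println(functionName.toLowerCase(Locale.ROOT)); // rmunit

        // Under tr-TR the uppercase 'I' lowercases to dotless U+0131,
        // so a lookup keyed on "rmunit" would silently miss.
        String turkish = functionName.toLowerCase(Locale.forLanguageTag("tr-TR"));
        System.out.println(turkish.equals("rmunit")); // false
    }
}
```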

Comment on lines +1225 to +1229
if ("none".equalsIgnoreCase(functionName)) {
return fieldArg.toString().equals(targetField.getField().toString())
? null
: new Let(targetField, fieldArg);
}

⚠️ Potential issue | 🟡 Minor

Use funcCtx.alias == null to detect none no-op instead of fragile toString() comparison

The Field class uses Lombok's @ToString annotation, which generates a representation like Field(field=..., fieldArgs=[...]). When fieldArg is a Field and targetField.getField() is a QualifiedName (which has a custom toString() returning the plain field name), these string representations are fundamentally different and will never be equal. This means the null branch is never taken, and none(field) without an alias always produces a redundant new Let(field, field) instead of being filtered out.

Since determineTargetField returns the same field object when no alias is provided, the correct no-op condition is simply:

🛠️ Proposed fix
     if ("none".equalsIgnoreCase(functionName)) {
-      return fieldArg.toString().equals(targetField.getField().toString())
-          ? null
-          : new Let(targetField, fieldArg);
+      // When no alias, none(field) is a no-op (source == target); drop it.
+      return funcCtx.alias == null ? null : new Let(targetField, fieldArg);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` around lines
1225 - 1229, The current check for the "none" no-op uses toString() comparison
between fieldArg and targetField.getField(), which is fragile; instead detect
the no-op by checking the function context alias (funcCtx.alias == null) because
determineTargetField returns the same field when no alias is provided. Replace
the ternary that returns null or new Let(targetField, fieldArg) with a condition
that if "none".equalsIgnoreCase(functionName) and funcCtx.alias == null then
return null, otherwise return new Let(targetField, fieldArg); update the block
referencing functionName, fieldArg, targetField (and Let) accordingly so the
redundant new Let(field, field) is not created when alias is absent.

Comment on lines +463 to +495
@Override
public String visitConvert(Convert node, String context) {
String child = node.getChild().get(0).accept(this, context);
String conversions =
node.getConversions().stream()
.map(
conversion -> {
String functionName = "";
String fields = MASK_COLUMN;
String actualSourceField = "";

if (conversion.getExpression() instanceof Function) {
Function func = (Function) conversion.getExpression();
functionName = func.getFuncName().toLowerCase(Locale.ROOT);
if (!func.getFuncArgs().isEmpty()
&& func.getFuncArgs().get(0) instanceof Field) {
actualSourceField = ((Field) func.getFuncArgs().get(0)).getField().toString();
}
fields =
func.getFuncArgs().stream()
.map(arg -> MASK_COLUMN)
.collect(Collectors.joining(","));
}

String targetField = conversion.getVar().getField().toString();

String asClause =
!targetField.equals(actualSourceField) ? " AS " + MASK_COLUMN : "";
return StringUtils.format("%s(%s)%s", functionName, fields, asClause);
})
.collect(Collectors.joining(","));
return StringUtils.format("%s | convert %s", child, conversions);
}
Contributor

⚠️ Potential issue | 🟡 Minor

none(field) AS alias conversions produce an empty function name in the anonymized output

When buildConversion in AstBuilder handles a none function with an alias, it stores the raw fieldArg (Field) — not a Function — as the Let expression. Here in visitConvert, the instanceof Function guard is never satisfied for that case, so functionName remains "" and the output is "(***) AS ***" instead of "none(***) AS ***".

🛠️ Suggested fix in the anonymizer
                if (conversion.getExpression() instanceof Function) {
                  Function func = (Function) conversion.getExpression();
                  functionName = func.getFuncName().toLowerCase(Locale.ROOT);
                  if (!func.getFuncArgs().isEmpty()
                      && func.getFuncArgs().get(0) instanceof Field) {
                    actualSourceField = ((Field) func.getFuncArgs().get(0)).getField().toString();
                  }
                  fields =
                      func.getFuncArgs().stream()
                          .map(arg -> MASK_COLUMN)
                          .collect(Collectors.joining(","));
+               } else if (conversion.getExpression() instanceof Field sourceField) {
+                 // none(field) AS alias — expression is the raw field, function name is "none"
+                 functionName = "none";
+                 actualSourceField = sourceField.getField().toString();
                }

Alternatively, in AstBuilder.buildConversion, always wrap the none-with-alias case in AstDSL.function("none", fieldArg) (matching the other functions), so the anonymizer sees a uniform Function node.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java`
around lines 463 - 495, In visitConvert, handle the case where
conversion.getExpression() is a Field (the "none(field) AS alias" case) so
functionName isn’t left empty: if conversion.getExpression() is a Field, set
functionName = "none", set actualSourceField to that Field's string, and set
fields to MASK_COLUMN (so the resulting format becomes "none(***) AS ***"); keep
the existing logic for Function nodes unchanged and preserve the existing
asClause/targetField logic that compares targetField to actualSourceField.

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@penghuo penghuo added PPL Piped processing language feature labels Feb 23, 2026
}

public static Object convert(Object value) {
return new AutoConvertFunction().convertValue(value);
Collaborator

Each static convert() method creates a new UDF instance per row (new AutoConvertFunction().convertValue(value)). Additionally, AutoConvertFunction.applyConversion creates new NumConvertFunction() per row at line 26. For large datasets this produces 2 object allocations per row per convert call. Since convertValue and applyConversion are stateless, use a cached singleton instance instead (e.g. private static final AutoConvertFunction INSTANCE = new AutoConvertFunction()). Same pattern applies to all 5 UDF classes.

Contributor Author

Updated to use cached singleton instance instead of creating new instance every row for the 5 UDF classes
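For readers following along, a minimal sketch of the cached-singleton shape described above — the class name and the body of convertValue are illustrative placeholders, not the actual PR code:

```java
public class AutoConvertFunctionSketch {
    // One shared stateless instance instead of a fresh allocation per row.
    private static final AutoConvertFunctionSketch INSTANCE = new AutoConvertFunctionSketch();

    private AutoConvertFunctionSketch() {}

    // Static per-row entrypoint delegates to the cached singleton.
    public static Object convert(Object value) {
        return INSTANCE.convertValue(value);
    }

    // Placeholder for the real stateless conversion logic.
    Object convertValue(Object value) {
        return value == null ? null : value.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(convert(" 42 ")); // 42
    }
}
```

Since convertValue holds no state, sharing one instance across all rows is safe and avoids the per-row allocations noted above.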

Pattern.compile("^([+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?)(.*)$");
private static final Pattern CONTAINS_LETTER_PATTERN = Pattern.compile(".*[a-zA-Z].*");
private static final Pattern STARTS_WITH_SIGN_OR_DIGIT = Pattern.compile("^[+-]?[\\d.].*");
private static final Pattern MEMK_PATTERN = Pattern.compile("^([+-]?\\d+\\.?\\d*)([kmgKMG])?$");
Collaborator

memk("512 m") (with a space before the unit) returns null. Is this the intended Splunk-compatible behavior?

Contributor Author

Yes, Splunk also returns null when there is a space between the value and unit
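A small standalone sketch of why the space makes the match fail — the MEMK_PATTERN here is copied from the diff above, while tryMemk and its k/m/g multipliers are simplified stand-ins for the real implementation:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MemkPatternDemo {
    // Pattern copied from the diff above.
    private static final Pattern MEMK_PATTERN =
        Pattern.compile("^([+-]?\\d+\\.?\\d*)([kmgKMG])?$");

    // Simplified memk: returns KB, or null when the input does not match.
    public static Double tryMemk(String s) {
        Matcher m = MEMK_PATTERN.matcher(s);
        if (!m.matches()) {
            return null;
        }
        double number = Double.parseDouble(m.group(1));
        String unit = m.group(2);
        if (unit == null || unit.equalsIgnoreCase("k")) {
            return number;
        }
        return unit.equalsIgnoreCase("m") ? number * 1024 : number * 1024 * 1024;
    }

    public static void main(String[] args) {
        System.out.println(tryMemk("512m"));  // 524288.0
        System.out.println(tryMemk("512 m")); // null — the space breaks the match
    }
}
```

The pattern anchors the optional unit immediately after the digits, so any whitespace between value and unit fails matches() outright.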

throw new IllegalArgumentException("Convert function requires a field argument");
}

String functionName = funcCtx.functionName.getText();
Collaborator

convert currently accepts any identifier as a function name and forwards it without semantic validation, so unsupported, non-convert functions (for example convert lower(field)) are accepted even though the command contract documents a fixed set (auto/num/rmcomma/rmunit/memk/none). This makes convert behave like a generic single-arg eval and can silently produce non-numeric output types; function names should be rejected early unless they are in the supported conversion list.

Contributor Author

Updated to only accept supported convert functions
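The early validation described above could look roughly like this — the class and method names are hypothetical; only the supported-function set comes from the documented command contract:

```java
import java.util.Locale;
import java.util.Set;

public class ConvertFunctionValidator {
    // Supported conversion functions per the documented convert command contract.
    private static final Set<String> SUPPORTED_CONVERSION_FUNCTIONS =
        Set.of("auto", "num", "rmcomma", "rmunit", "memk", "none");

    // Rejects unsupported names early and returns the normalized (lowercase) name.
    public static String validate(String functionName) {
        String normalized = functionName.toLowerCase(Locale.ROOT);
        if (!SUPPORTED_CONVERSION_FUNCTIONS.contains(normalized)) {
            throw new IllegalArgumentException(
                String.format(
                    "Unsupported conversion function '%s'. Supported functions are: %s",
                    functionName, SUPPORTED_CONVERSION_FUNCTIONS));
        }
        return normalized;
    }

    public static void main(String[] args) {
        System.out.println(validate("AUTO")); // auto
    }
}
```

Returning the normalized name also addresses the earlier review point: the same lowercase string is then used both for validation and when building the AST node.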

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Contributor

@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (5)
core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java (1)

13-18: ⚠️ Potential issue | 🟡 Minor

Add JavaDoc for the public constructor and convert entrypoint.

These public methods still lack JavaDoc with @param, @return, and @throws.
As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java`
around lines 13 - 18, Add JavaDoc for the public constructor
MemkConvertFunction() and the public static entrypoint convert(Object value):
document the constructor and the convert method with a brief description,
include `@param` for "value", `@return` describing the returned Object (and its
semantics), and `@throws` for any exceptions the method may propagate (e.g.,
IllegalArgumentException or any specific runtime exceptions thrown by
INSTANCE.convertValue). Place the JavaDoc immediately above the constructor and
the convert method declarations so they comply with the project's "Public
methods MUST have JavaDoc" rule.
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (2)

1225-1244: ⚠️ Potential issue | 🟡 Minor

Normalize conversion function names before building the function node.

You validate using toLowerCase() but pass the raw functionName into AstDSL.function(...). If resolution is case-sensitive, CONVERT AUTO(field) could fail or diverge from registered operator naming. Normalize once and use the normalized name in the AST.

🛠️ Suggested fix
-    String functionName = funcCtx.functionName.getText();
+    String functionName = funcCtx.functionName.getText();
+    String normalizedFunctionName = functionName.toLowerCase(Locale.ROOT);
...
-    if (!SUPPORTED_CONVERSION_FUNCTIONS.contains(functionName.toLowerCase())) {
+    if (!SUPPORTED_CONVERSION_FUNCTIONS.contains(normalizedFunctionName)) {
       throw new IllegalArgumentException(
           String.format(
               "Unsupported conversion function '%s'. Supported functions are: %s",
               functionName, SUPPORTED_CONVERSION_FUNCTIONS));
     }
...
-    return new Let(targetField, AstDSL.function(functionName, fieldArg));
+    return new Let(targetField, AstDSL.function(normalizedFunctionName, fieldArg));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` around lines
1225 - 1244, The code validates functionName using toLowerCase() against
SUPPORTED_CONVERSION_FUNCTIONS but still uses the raw functionName when building
the AST; normalize the name once (e.g., lowercased normalizedName) after reading
funcCtx.functionName.getText(), use normalizedName for the contains check, for
the "none" branch comparison, and pass normalizedName into AstDSL.function(...)
when constructing the Let node (leave usage of internalVisitExpression and
determineTargetField unchanged).

1238-1242: ⚠️ Potential issue | 🟡 Minor

Fix none no-op detection to avoid redundant Let.

The toString() comparison between fieldArg and targetField.getField() is fragile and will never match for Field vs QualifiedName. This makes none(field) always emit new Let(field, field) instead of being a no-op. Use the alias presence to detect no-op instead.

🛠️ Suggested fix
-    if ("none".equalsIgnoreCase(functionName)) {
-      return fieldArg.toString().equals(targetField.getField().toString())
-          ? null
-          : new Let(targetField, fieldArg);
-    }
+    if ("none".equalsIgnoreCase(functionName)) {
+      // When no alias is provided, none(field) is a no-op.
+      return funcCtx.alias == null ? null : new Let(targetField, fieldArg);
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` around lines
1238 - 1242, The current "none" branch in AstBuilder uses a fragile toString()
comparison between fieldArg and targetField.getField() and thus emits new
Let(...) even for no-ops; change the check to detect a true no-op by inspecting
fieldArg's alias instead: in the if ("none".equalsIgnoreCase(functionName))
branch, return null when fieldArg has an alias that matches the target field
(i.e., detect alias presence on fieldArg and compare it to the
targetField.getField() identity/name) and only create new Let(targetField,
fieldArg) when that alias-based no-op check fails.
core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java (1)

13-18: ⚠️ Potential issue | 🟡 Minor

Add JavaDoc for the public constructor and convert entrypoint.

These public methods still lack JavaDoc with @param, @return, and @throws.
As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java`
around lines 13 - 18, Add JavaDoc comments for the public constructor
AutoConvertFunction() and the public static entrypoint convert(Object value):
document the parameters with `@param`, the return value with `@return`, and any
possible exceptions thrown with `@throws` (e.g., if convertValue can throw
IllegalArgumentException or other runtime exceptions) following the project's
JavaDoc guidelines; place the JavaDoc immediately above the
AutoConvertFunction() constructor and above the convert(Object) method and
ensure wording describes purpose, parameter semantics, return type, and
conditions that lead to exceptions.
core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java (1)

13-18: ⚠️ Potential issue | 🟡 Minor

Add JavaDoc for the public constructor and convert entrypoint.

These public methods still lack JavaDoc with @param, @return, and @throws.
As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java`
around lines 13 - 18, Add JavaDoc for the public RmcommaConvertFunction()
constructor and the public static convert(Object value) entrypoint: document
parameters with `@param` (describe "value"), describe the return value with
`@return`, and declare any runtime exceptions thrown (or document that none are
thrown) with `@throws`; reference the class RmcommaConvertFunction, the
constructor RmcommaConvertFunction(), and the convert(Object) method (which
delegates to INSTANCE.convertValue) so future readers know the delegation to
INSTANCE.convertValue and any exceptions it may propagate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java`:
- Around line 13-18: Add JavaDoc comments for the public constructor
AutoConvertFunction() and the public static entrypoint convert(Object value):
document the parameters with `@param`, the return value with `@return`, and any
possible exceptions thrown with `@throws` (e.g., if convertValue can throw
IllegalArgumentException or other runtime exceptions) following the project's
JavaDoc guidelines; place the JavaDoc immediately above the
AutoConvertFunction() constructor and above the convert(Object) method and
ensure wording describes purpose, parameter semantics, return type, and
conditions that lead to exceptions.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java`:
- Around line 13-18: Add JavaDoc for the public constructor
MemkConvertFunction() and the public static entrypoint convert(Object value):
document the constructor and the convert method with a brief description,
include `@param` for "value", `@return` describing the returned Object (and its
semantics), and `@throws` for any exceptions the method may propagate (e.g.,
IllegalArgumentException or any specific runtime exceptions thrown by
INSTANCE.convertValue). Place the JavaDoc immediately above the constructor and
the convert method declarations so they comply with the project's "Public
methods MUST have JavaDoc" rule.

In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java`:
- Around line 13-18: Add JavaDoc for the public RmcommaConvertFunction()
constructor and the public static convert(Object value) entrypoint: document
parameters with `@param` (describe "value"), describe the return value with
`@return`, and declare any runtime exceptions thrown (or document that none are
thrown) with `@throws`; reference the class RmcommaConvertFunction, the
constructor RmcommaConvertFunction(), and the convert(Object) method (which
delegates to INSTANCE.convertValue) so future readers know the delegation to
INSTANCE.convertValue and any exceptions it may propagate.

In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java`:
- Around line 1225-1244: The code validates functionName using toLowerCase()
against SUPPORTED_CONVERSION_FUNCTIONS but still uses the raw functionName when
building the AST; normalize the name once (e.g., lowercased normalizedName)
after reading funcCtx.functionName.getText(), use normalizedName for the
contains check, for the "none" branch comparison, and pass normalizedName into
AstDSL.function(...) when constructing the Let node (leave usage of
internalVisitExpression and determineTargetField unchanged).
- Around line 1238-1242: The current "none" branch in AstBuilder uses a fragile
toString() comparison between fieldArg and targetField.getField() and thus emits
new Let(...) even for no-ops; change the check to detect a true no-op by
inspecting fieldArg's alias instead: in the if
("none".equalsIgnoreCase(functionName)) branch, return null when fieldArg has an
alias that matches the target field (i.e., detect alias presence on fieldArg and
compare it to the targetField.getField() identity/name) and only create new
Let(targetField, fieldArg) when that alias-based no-op check fails.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3293010 and ac93662.

📒 Files selected for processing (7)
  • core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/NumConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/RmunitConvertFunction.java
  • docs/user/ppl/cmd/convert.md
  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
🚧 Files skipped from review as they are similar to previous changes (3)
  • core/src/main/java/org/opensearch/sql/expression/function/udf/NumConvertFunction.java
  • docs/user/ppl/cmd/convert.md
  • core/src/main/java/org/opensearch/sql/expression/function/udf/RmunitConvertFunction.java

penghuo previously approved these changes Feb 26, 2026
Collaborator

@penghuo penghuo left a comment

Thanks!

Please fix merge conflict.

Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
@github-actions
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add UDF conversion functions (auto, num, rmcomma, rmunit, memk)

Relevant files:

  • core/src/main/java/org/opensearch/sql/expression/function/udf/BaseConversionUDF.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/NumConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/RmunitConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
  • core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java
  • core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java

Sub-PR theme: Implement PPL convert command parser, AST, and Calcite visitor

Relevant files:

  • ppl/src/main/antlr/OpenSearchPPLParser.g4
  • ppl/src/main/antlr/OpenSearchPPLLexer.g4
  • core/src/main/java/org/opensearch/sql/ast/tree/Convert.java
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteConvertCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/ppl/ConvertCommandIT.java
  • docs/user/ppl/cmd/convert.md
  • docs/user/ppl/index.md

⚡ Recommended focus areas for review

Incorrect Pattern

The MEMK_PATTERN regex ^([+-]?\d+\.?\d*)([kmgKMG])?$ handles ordinary decimals like "1.5g", but it does not support scientific notation and, more critically, does not allow a leading dot (e.g., ".5g"), while LEADING_NUMBER_WITH_UNIT_PATTERN does allow \.\d+. This inconsistency between the two patterns could cause unexpected null returns for inputs that the other conversion paths would accept.

private static final Pattern MEMK_PATTERN = Pattern.compile("^([+-]?\\d+\\.?\\d*)([kmgKMG])?$");
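To make the inconsistency concrete, a quick sketch comparing the two patterns — both regexes are copied from the review context; the wrapper class and helper methods are illustrative:

```java
import java.util.regex.Pattern;

public class PatternInconsistencyDemo {
    // Both patterns copied from the review context.
    private static final Pattern MEMK_PATTERN =
        Pattern.compile("^([+-]?\\d+\\.?\\d*)([kmgKMG])?$");
    private static final Pattern LEADING_NUMBER_WITH_UNIT_PATTERN =
        Pattern.compile("^([+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?)(.*)$");

    public static boolean memkMatches(String s) {
        return MEMK_PATTERN.matcher(s).matches();
    }

    public static boolean leadingMatches(String s) {
        return LEADING_NUMBER_WITH_UNIT_PATTERN.matcher(s).matches();
    }

    public static void main(String[] args) {
        // ".5g" is accepted by the leading-number pattern but rejected by MEMK_PATTERN,
        // because the latter requires at least one digit before the optional dot.
        System.out.println(leadingMatches(".5g")); // true
        System.out.println(memkMatches(".5g"));    // false
    }
}
```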
Comma Handling Logic

In applyConversion, when a string contains commas, tryConvertWithCommaRemoval is called first: it removes ALL commas and tries to parse. If the string also has a unit suffix (e.g., "1,234 kg"), comma removal produces "1234 kg", which fails to parse as a double, and the leading-number extraction is then tried on the original string. For an input like "1,234.56 units", that extraction stops at the first comma and yields "1" as the leading number, so the ordering of these checks can produce an unexpectedly small result (or null) instead of 1234.56.

if (preprocessedValue.contains(",")) {
  result = tryConvertWithCommaRemoval(preprocessedValue);
  if (result != null) {
    return result;
  }
}

String leadingNumber = extractLeadingNumber(preprocessedValue);
if (hasValidUnitSuffix(preprocessedValue, leadingNumber)) {
  return tryParseDouble(leadingNumber);
}

return null;
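A simplified sketch of the ordering described above — it omits the hasValidUnitSuffix gate and the other preprocessing, so it only approximates the real applyConversion — showing how "1,234.56 units" falls through comma removal and then has its leading number cut off at the first comma:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommaHandlingDemo {
    // Simplified leading-number pattern, mirroring LEADING_NUMBER_WITH_UNIT_PATTERN.
    private static final Pattern LEADING_NUMBER =
        Pattern.compile("^([+-]?(?:\\d+\\.?\\d*|\\.\\d+))(.*)$");

    static Double tryParseDouble(String s) {
        try {
            return Double.parseDouble(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Mirrors the ordering above: comma removal first, then leading-number extraction.
    // The real code additionally gates the extraction behind hasValidUnitSuffix.
    public static Double applyConversion(String value) {
        if (value.contains(",")) {
            Double result = tryParseDouble(value.replace(",", ""));
            if (result != null) {
                return result;
            }
        }
        Matcher m = LEADING_NUMBER.matcher(value);
        if (m.matches()) {
            return tryParseDouble(m.group(1));
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(applyConversion("1,234"));          // 1234.0 — comma removal succeeds
        System.out.println(applyConversion("1,234.56 units")); // 1.0 — extraction stops at the comma
    }
}
```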
Null Filtering

In visitConvertCommand, conversions that return null (from buildConversion when none() is used without an alias on the same field) are silently filtered out. This means a Convert node could be created with an empty conversions list if all conversions are none() without aliases. While visitConvert handles empty conversions gracefully, this silent filtering may hide user errors or unexpected behavior.

    ctx.convertFunction().stream()
        .map(this::buildConversion)
        .filter(conversion -> conversion != null)
        .collect(Collectors.toList());
return new Convert(conversions);
Anonymization Bug

In visitConvert, when the expression is not a Function (e.g., for none() conversions that are represented as Field in the AST after the buildConversion logic), functionName remains an empty string and actualSourceField remains empty. This results in output like "(identifier) AS identifier" for none() with alias, which is inconsistent with the expected format. The anonymizer test confirms this behavior but it looks like a formatting issue in the anonymized output.

  String functionName = "";
  String fields = MASK_COLUMN;
  String actualSourceField = "";

  if (conversion.getExpression() instanceof Function) {
    Function func = (Function) conversion.getExpression();
    functionName = func.getFuncName().toLowerCase(Locale.ROOT);
    if (!func.getFuncArgs().isEmpty()
        && func.getFuncArgs().get(0) instanceof Field) {
      actualSourceField = ((Field) func.getFuncArgs().get(0)).getField().toString();
    }
    fields =
        func.getFuncArgs().stream()
            .map(arg -> MASK_COLUMN)
            .collect(Collectors.joining(","));
  }

  String targetField = conversion.getVar().getField().toString();

  String asClause =
      !targetField.equals(actualSourceField) ? " AS " + MASK_COLUMN : "";
  return StringUtils.format("%s(%s)%s", functionName, fields, asClause);
})
Unused Enums

Several enum values are added (CTIME, MKTIME, DUR2SEC, MSTIME, NONE) that are not implemented or registered as operators in this PR. These appear to be placeholders for future work but are not documented as such, and NONE in particular is referenced in the grammar but not registered as an operator (which is intentional since none() is a no-op, but could be confusing).

AUTO(FunctionName.of("auto")),
NUM(FunctionName.of("num")),
CTIME(FunctionName.of("ctime")),
MKTIME(FunctionName.of("mktime")),
DUR2SEC(FunctionName.of("dur2sec")),
MEMK(FunctionName.of("memk")),
MSTIME(FunctionName.of("mstime")),
RMUNIT(FunctionName.of("rmunit")),
RMCOMMA(FunctionName.of("rmcomma")),
NONE(FunctionName.of("none")),

@github-actions
Contributor

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
Possible issue
Fix incorrect duplicate field detection for alias conversions

The seenFields set tracks source fields to prevent duplicate conversions, but in
processFieldCopyConversion (used for none() with alias), the source field is added
to seenFields. This means if a user writes convert none(balance) AS balance_copy,
auto(balance), the auto(balance) conversion will be incorrectly rejected with "Field
'balance' cannot be converted more than once", even though the original field is not
being overwritten. The seenFields check should only apply when the source field is
being replaced (i.e., target.equals(source)).

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1024-1038]

 private void processFieldCopyConversion(
     String target, Field field, ConversionState state, CalcitePlanContext context) {
   String source = field.getField().toString();
-
-  if (state.seenFields.contains(source)) {
-    throw new SemanticCheckException(
-        String.format("Field '%s' cannot be converted more than once", source));
-  }
-  state.seenFields.add(source);
 
   if (!target.equals(source)) {
     RexNode sourceField = context.relBuilder.field(source);
     state.additions.add(Pair.of(target, context.relBuilder.alias(sourceField, target)));
   }
+  // Only track as "seen" when overwriting the source field
 }
Suggestion importance[1-10]: 5


Why: The suggestion identifies a real logical issue where seenFields tracking in processFieldCopyConversion could incorrectly block valid conversions like convert none(balance) AS balance_copy, auto(balance). However, the improved code removes the seenFields tracking entirely for copy conversions, which may let genuine duplicate-copy cases go undetected.

Low
Fix regex pattern to support leading plus sign

The MEMK_PATTERN only matches digits without a sign prefix
(^([+-]?\d+\.?\d*)([kmgKMG])?$), but the test testMemkConvert expects
MemkConvertFunction.convert("+100") to return 100.0 and
MemkConvertFunction.convert("+50m") to return 51200.0. The pattern \d+ does not
match a leading + sign, so these cases will return null instead of the expected
values. The pattern should be updated to allow an optional leading sign.

core/src/main/java/org/opensearch/sql/expression/function/udf/BaseConversionUDF.java [34]

-protected Double tryConvertMemoryUnit(String str) {
-  Matcher matcher = MEMK_PATTERN.matcher(str);
-  if (!matcher.matches()) {
-    return null;
-  }
+private static final Pattern MEMK_PATTERN = Pattern.compile("^([+-]?\\d+\\.?\\d*)([kmgKMG])?$");
 
-  Double number = tryParseDouble(matcher.group(1));
-  if (number == null) {
-    return null;
-  }
-
-  String unit = matcher.group(2);
-  if (unit == null || unit.equalsIgnoreCase("k")) {
-    return number;
-  }
-
-  double multiplier =
-      switch (unit.toLowerCase()) {
-        case "m" -> MB_TO_KB;
-        case "g" -> GB_TO_KB;
-        default -> 1.0;
-      };
-
-  return number * multiplier;
-}
-
Suggestion importance[1-10]: 2

Why: The MEMK_PATTERN already includes `[+-]?` in the pattern (`^([+-]?\\d+\\.?\\d*)([kmgKMG])?$`), so the suggestion's claim that `+` is not supported is incorrect. The 'improved_code' is identical to the existing code, making this suggestion invalid.

Impact: Low
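The "Why" note's point can be checked directly: the existing pattern already accepts an optional leading sign. A standalone sketch of the memory-unit conversion (constants and names reconstructed from the suggestion, not the actual class) shows the signed inputs from testMemkConvert converting as expected:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MemkPatternCheck {
    // Pattern as quoted in the review: optional sign, number, optional k/m/g unit.
    static final Pattern MEMK_PATTERN = Pattern.compile("^([+-]?\\d+\\.?\\d*)([kmgKMG])?$");
    static final double MB_TO_KB = 1024.0;
    static final double GB_TO_KB = 1024.0 * 1024.0;

    /** Converts a memory string to kilobytes, or null if it does not match. */
    static Double convert(String str) {
        Matcher m = MEMK_PATTERN.matcher(str);
        if (!m.matches()) return null;
        double number = Double.parseDouble(m.group(1));
        String unit = m.group(2);
        if (unit == null || unit.equalsIgnoreCase("k")) return number;
        return number * (unit.equalsIgnoreCase("m") ? MB_TO_KB : GB_TO_KB);
    }

    public static void main(String[] args) {
        System.out.println(convert("+100")); // 100.0
        System.out.println(convert("+50m")); // 51200.0
        System.out.println(convert("abc"));  // null
    }
}
```

Both `+100` and `+50m` match because `[+-]?` precedes `\d+`, confirming that the suggested "fix" changes nothing.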
General
Avoid relying on package-private instance field

AutoConvertFunction accesses NumConvertFunction.INSTANCE which has package-private
visibility (static final NumConvertFunction INSTANCE). While this works within the
same package, it creates a tight coupling and bypasses the public API. The auto()
function should delegate to NumConvertFunction.convert() or use its own logic to
avoid relying on package-private state.

core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java [22-29]

 @Override
 protected Object applyConversion(String preprocessedValue) {
   Double result = tryConvertMemoryUnit(preprocessedValue);
   if (result != null) {
     return result;
   }
 
-  return NumConvertFunction.INSTANCE.applyConversion(preprocessedValue);
+  return new NumConvertFunction().applyConversion(preprocessedValue);
 }
Suggestion importance[1-10]: 3

Why: The suggestion correctly identifies that NumConvertFunction.INSTANCE has package-private visibility, creating tight coupling. However, the proposed fix of creating a new NumConvertFunction() each time is not ideal either; a better approach would be to make INSTANCE public or use the static convert method.
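The alternative the "Why" note prefers, delegating through a public static entry point instead of reaching into package-private state, can be sketched as follows. Class and method names mirror the suggestion but the bodies are simplified placeholders, not the real implementations:

```java
class NumConvertFunction {
    // Singleton stays package-private; callers go through the static method.
    private static final NumConvertFunction INSTANCE = new NumConvertFunction();

    /** Public, stateless entry point that delegates to the singleton. */
    public static Object convert(String value) {
        return INSTANCE.applyConversion(value);
    }

    Object applyConversion(String value) {
        try {
            return Double.parseDouble(value); // placeholder numeric conversion
        } catch (NumberFormatException e) {
            return null;
        }
    }
}

class AutoConvertFunction {
    Object applyConversion(String value) {
        // memory-unit handling omitted; fall through to numeric conversion
        // via the public API rather than NumConvertFunction.INSTANCE
        return NumConvertFunction.convert(value);
    }
}

public class DelegationSketch {
    public static void main(String[] args) {
        System.out.println(new AutoConvertFunction().applyConversion("42")); // 42.0
    }
}
```

This keeps the singleton an implementation detail while avoiding a fresh allocation per call.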
Log unexpected non-numeric conversion results

The toDoubleOrNull method silently returns null for non-Number objects, but the
convert static methods in subclasses (e.g., AutoConvertFunction.convert) can return
non-Number objects (e.g., a String). This means if applyConversion accidentally
returns a non-numeric type, the error will be silently swallowed and null will be
returned instead of surfacing the bug. Consider adding a log warning or assertion
for unexpected non-null, non-Number return values.

core/src/main/java/org/opensearch/sql/expression/function/udf/BaseConversionUDF.java [181-186]

 public static Double toDoubleOrNull(Object value) {
+  if (value == null) {
+    return null;
+  }
   if (value instanceof Number) {
     return ((Number) value).doubleValue();
   }
+  log.warn("Unexpected non-numeric value returned from conversion function: {}", value.getClass());
   return null;
 }
Suggestion importance[1-10]: 2

Why: This is a minor defensive logging suggestion. The subclass applyConversion methods are designed to return Object (either Double or null), and the toDoubleOrNull method is a final safety net. Adding a warning log is a minor improvement but has low impact on correctness.

Impact: Low
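A dependency-free version of the defensive check can be sketched with java.util.logging (the real class may use a different logging facade; names here are illustrative only):

```java
import java.util.logging.Logger;

public class ToDoubleSketch {
    private static final Logger log = Logger.getLogger(ToDoubleSketch.class.getName());

    /** Safety net: returns a Double for Numbers, null otherwise, warning on surprises. */
    public static Double toDoubleOrNull(Object value) {
        if (value == null) {
            return null; // expected: conversion produced no result
        }
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        // Unexpected: a conversion function returned a non-numeric object.
        log.warning("Unexpected non-numeric value from conversion function: " + value.getClass());
        return null;
    }

    public static void main(String[] args) {
        System.out.println(toDoubleOrNull(3));      // 3.0
        System.out.println(toDoubleOrNull("oops")); // null, with a warning logged
    }
}
```

The behavior is unchanged for the null and Number paths; only the unexpected branch gains visibility.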

Labels: feature, PPL (Piped processing language)
