feat: Implement PPL convert command with 5 conversion functions #5157

ritvibhatt wants to merge 37 commits into opensearch-project:main
Conversation
- Added convert command syntax and AST nodes (Convert, ConvertFunction)
- Implemented 5 conversion functions: auto, num, rmcomma, rmunit, none
- Full Calcite pushdown support via CalciteRelNodeVisitor
- Logical and physical operators (LogicalConvert, ConvertOperator)
- Comprehensive test coverage (26 tests total):
  * 13 unit tests (CalcitePPLConvertTest)
  * 8 integration tests with pushdown (ConvertCommandIT)
  * 8 non-pushdown tests (CalciteConvertCommandIT)
  * 3 explain tests (ExplainIT)
  * 2 cross-cluster tests (CrossClusterSearchIT)
  * Anonymizer and v2 unsupported tests
- User documentation (docs/user/ppl/cmd/convert.md)
- Code cleanup: removed dead code, simplified javadocs
- Version: 3.5 (experimental)

All tests compile and pass successfully.

Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Centralized common conversion function logic into BaseConversionUDF base class, reducing code from 219 to 162 lines (26% reduction) while maintaining all functionality. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
…d separating concerns into each respective function class Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Replace ConvertFunction class with standard Let expressions to handle none() at AST level and reduce code complexity. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
…tCommand Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Add testNumConvertNullAndEmpty() to ensure NumConvertFunction.convert() properly returns null for null, empty string, and whitespace-only inputs, matching the coverage pattern of other conversion functions. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Add testRmunitConvertNullAndEmpty() to ensure RmunitConvertFunction.convert() properly returns null for null, empty string, and whitespace-only inputs, completing the test coverage pattern across all conversion functions. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Update testConvertWithAlias to verify that using AS clause preserves the original field alongside the converted field. Test now checks both balance (bigint) and balance_num (double) are present in results. Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Actionable comments posted: 17
🧹 Nitpick comments (9)
integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java (1)
217-222: Assertions only verify column presence — consider asserting converted values.

Both tests call only `verifyColumn(result, columnName(...))`, which confirms the output column exists but does not validate that `auto(balance)` actually produced correct converted values. Compare with the existing `testCrossClusterSortWithTypeCasting` (line 162), which uses `verifyDataRows` to assert specific numeric values after a similar conversion. A column-name-only check would pass even if the conversion silently returned nulls or wrong types for every row.

Consider adding a `verifyDataRows` call with representative expected balance values (the same 7 rows already expected in `testCrossClusterSortWithTypeCasting`), or at minimum assert that the result is non-empty and the converted column is of numeric type.

As per coding guidelines, "Check test assertions are meaningful and specific."
Also applies to: 231-236
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java` around lines 217 - 222, Current assertions in CrossClusterSearchIT call verifyColumn(result, columnName("balance")) after executeQuery(...) which only checks column presence; update the tests to assert actual converted numeric values by invoking verifyDataRows(result, expectedRows...) using the same 7 expected balance values used in testCrossClusterSortWithTypeCasting, or at minimum add assertions that the result is non-empty and that each value in the converted "balance" column is numeric; locate the call sites around executeQuery/String.format(...) with TEST_INDEX_BANK_REMOTE and replace or extend the verifyColumn(...) checks with verifyDataRows(...) (or additional numeric-type/non-empty assertions) to make the test meaningful.

integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java (1)
450-459: Consider adding a `verifyDataRows` call for a more specific assertion.

Both new tests assert only column presence and schema type, which is consistent with the existing pattern in this class (e.g., `testCrossClusterFieldsSpaceDelimited`). However, the coding guidelines ask that integration test assertions be "meaningful and specific." Adding at least one row-value check would strengthen confidence that actual values are being converted correctly across the cluster boundary, not just that the projected type is correct.

💡 Optional: add row-value assertion
```diff
 @Test
 public void testCrossClusterConvertWithAlias() throws IOException {
   JSONObject result =
       executeQuery(
           String.format(
-              "search source=%s | convert auto(balance) AS balance_num | fields balance_num",
+              "search source=%s | where firstname='Hattie' | convert auto(balance) AS balance_num | fields balance_num",
               TEST_INDEX_BANK_REMOTE));
   verifyColumn(result, columnName("balance_num"));
   verifySchema(result, schema("balance_num", "double"));
+  verifyDataRows(result, rows(5686.0));
 }
```

The same pattern could apply to `testCrossClusterConvert`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java` around lines 450 - 459, Add a concrete row-value assertion to testCrossClusterConvertWithAlias: after the existing verifyColumn(result, columnName("balance_num")) and verifySchema(result, schema("balance_num", "double")), call verifyDataRows(result, rows("balance_num", "<expected_double>")) to assert at least one converted numeric value is correct (replace <expected_double> with the known balance value from TEST_INDEX_BANK_REMOTE as a double); reference symbols: testCrossClusterConvertWithAlias, executeQuery, verifyColumn, verifySchema, verifyDataRows.

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java (1)
1-272: Missing negative test cases required by PPL guidelines

This file path falls under `**/ppl/**/*.java`, which mandates verifying grammar tests with positive and negative cases. The entire suite is positive-path only. There are no tests verifying parser/planner behavior for:

- An unrecognised conversion function name (e.g., `convert unknown_fn(SAL)`)
- A missing argument (e.g., `convert auto()`)
- A non-existent source field (e.g., `convert auto(NONEXISTENT_COL)`)
- `none()` called with an alias that shadows an existing column

At minimum, an exception/error-propagation test for an invalid function name should be added to confirm the parser rejects it cleanly.
As per coding guidelines, "For PPL parser changes, verify grammar tests with positive/negative cases."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java` around lines 1 - 272, The test suite in CalcitePPLConvertTest only covers positive convert cases; add negative grammar tests to validate error handling (at minimum a test that asserts an invalid conversion function is rejected). In CalcitePPLConvertTest add a new `@Test` method (e.g., testConvertInvalidFunction) that calls getRelNode("source=EMP | convert unknown_fn(SAL)") and asserts an exception is thrown (use assertThrows or the existing test harness), and similarly add tests for missing argument ("convert auto()"), non-existent field ("convert auto(NONEXISTENT_COL)"), and alias shadowing with none() (e.g., "convert none(EMPNO) AS EMPNO") to verify the parser/planner raises/propagates errors instead of succeeding; use getRelNode and the existing verification helpers (verifyLogical/verifyPPLToSparkSQL) pattern for positive vs negative assertions.

core/src/main/java/org/opensearch/sql/expression/function/udf/BaseConversionUDF.java (1)
25-53: Missing JavaDoc on public API methods.

Per coding guidelines, public methods in `core/src/main/java/**/*.java` must have JavaDoc with `@param`, `@return`, and `@throws`. The public methods `getReturnTypeInference()`, `getOperandMetadata()`, and `convertValue()` (which is the core template method) lack documentation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/org/opensearch/sql/expression/function/udf/BaseConversionUDF.java` around lines 25 - 53, Add JavaDoc for the public API methods: add Javadoc comments to BaseConversionUDF#getReturnTypeInference() and BaseConversionUDF#getOperandMetadata() including a brief description and appropriate `@return` tags; also locate the core template method convertValue(...) (in this class or ConversionImplementor) and add Javadoc describing its purpose, parameters (`@param` for each argument), return value (`@return`) and any exceptions it can throw (`@throws`). Ensure the Javadoc text follows project conventions and documents nullability/behavior where relevant.

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (1)
425-427: Optional: Add JavaDoc to `visitConvert`.

All other visitor method stubs in this class lack JavaDoc; this new method follows that same pattern. However, the `core/src/main/java/**/*.java` guideline requires public methods to have JavaDoc with `@param`, `@return`, and `@throws`. Consider adding it here (and ideally addressing the systemic gap in a follow-up).

📝 Suggested JavaDoc
```diff
+  /**
+   * Visit Convert node.
+   *
+   * @param node {@link Convert}
+   * @param context Context
+   * @return Return Type.
+   */
   public T visitConvert(Convert node, C context) {
     return visitChildren(node, context);
   }
```

As per coding guidelines: "Public methods MUST have JavaDoc with `@param`, `@return`, and `@throws`" (applicable to `core/src/main/java/**/*.java`).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java` around lines 425 - 427, Add JavaDoc to the public method visitConvert in AbstractNodeVisitor: document the method purpose and include `@param` for the Convert node and context (C context), `@return` describing the returned generic T, and `@throws` for any runtime exceptions that may be propagated; ensure the JavaDoc style matches other methods in AbstractNodeVisitor and complies with the core/src/main/java/**/*.java requirement for public methods to have `@param`, `@return`, and `@throws` tags.

ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1)
1046-1062: `memk()` conversion function has no anonymizer test coverage.

All other conversion functions (`auto`, `num`, `rmcomma`, `rmunit`, `none`) have at least one assertion in `testConvertCommand`, but `memk` is absent. Add a case such as:

```diff
+    assertEquals(
+        "source=table | convert memk(identifier)",
+        anonymize("source=t | convert memk(memory)"));
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java` around lines 1046 - 1062, The testConvertCommand in PPLQueryDataAnonymizerTest is missing coverage for the memk() conversion; add an assertion that calls anonymize with an input using memk (e.g., "source=t | convert memk(field)") and expects the anonymized output to map the field to the generic identifier form consistent with other cases (e.g., "source=table | convert memk(identifier)" or with AS when appropriate). Update the testConvertCommand method to include this new assertEquals using the existing anonymize helper so memk conversion is validated alongside auto, num, rmcomma, rmunit, and none.

core/src/main/java/org/opensearch/sql/ast/tree/Convert.java (1)
19-26: Avoid public setters to keep the AST node immutable.

`@Setter` exposes mutable state outside `attach`, which weakens immutability guarantees for AST nodes. Consider removing the class-level `@Setter` to keep mutation scoped.

As per coding guidelines: `core/src/main/java/org/opensearch/sql/ast/**/*.java`: AST nodes must be immutable where possible.

♻️ Proposed change
```diff
-@Setter
 @ToString
 @EqualsAndHashCode(callSuper = false)
 @RequiredArgsConstructor
 public class Convert extends UnresolvedPlan {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/org/opensearch/sql/ast/tree/Convert.java` around lines 19 - 26, Remove the class-level `@Setter` to prevent broad mutation of the AST node; keep `@Getter`/`@ToString`/`@EqualsAndHashCode`/`@RequiredArgsConstructor`. Replace the broad setter by adding a narrowly scoped mutator (e.g., a package-private method like attach(UnresolvedPlan child) or attachChild(UnresolvedPlan child)) that sets the private field child so only callers in the same package (or the intended attach logic) can modify it; leave conversions as final and immutable via the existing field.

core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java (2)
316-327: `RmcommaConvertFunction` returns `null` for all scientific-notation extreme values.

Lines 318–324 assert `null` for strings like `"1.7976931348623157E308"` and `"1e309"` because the `E`/`e` character triggers `containsLetter()`. This means `rmcomma` silently rejects any scientific-notation input. If this is intentional, a brief comment in the test explaining why would help future maintainers — `containsLetter()` treats `e` as alphabetic, so all exponential representations are rejected.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java` around lines 316 - 327, The test currently asserts null for scientific-notation strings because RmcommaConvertFunction.convert uses containsLetter(), which treats 'e'/'E' as alphabetic and thus rejects exponential notation; update the test method testRmcommaConvertNumericExtremes to include a brief inline comment (near the assertions referencing RmcommaConvertFunction.convert) stating that scientific/exponential formats are intentionally rejected due to containsLetter() treating 'e' as a letter, so these inputs return null, to help future maintainers understand the behavior.
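The rejection is easy to reproduce with a letter-checking guard. This is a hedged sketch assuming the real `RmcommaConvertFunction` strips commas and then bails out if any alphabetic character remains; the actual internals may differ.

```java
// Sketch of why letter-checking rejects scientific notation.
// Assumption: rmcomma strips commas, then returns null if any letter remains.
public class RmcommaSketch {
  static Double rmcomma(String value) {
    if (value == null || value.trim().isEmpty()) {
      return null;
    }
    String stripped = value.replace(",", "");
    // 'e' in "1e309" and 'E' in "1.79E308" count as letters here.
    for (char c : stripped.toCharArray()) {
      if (Character.isLetter(c)) {
        return null;
      }
    }
    try {
      return Double.valueOf(stripped);
    } catch (NumberFormatException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(rmcomma("1,234"));    // 1234.0
    System.out.println(rmcomma("1e309"));    // null: 'e' is a letter
    System.out.println(rmcomma("1.79E308")); // null: 'E' is a letter
  }
}
```

A one-line comment to this effect in the test would capture the same explanation for future maintainers.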
228-236: Special-value tests (`Infinity`, `NaN`, `∞`) only cover `Num` and `Auto`.

`RmunitConvertFunction`, `RmcommaConvertFunction`, and `MemkConvertFunction` are not tested against these special value strings. Per guidelines, error condition tests should be included for all new functions. Consider extending this test (or adding a sibling) to assert the expected behavior for the remaining three functions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java` around lines 228 - 236, Extend the special-value test so RmunitConvertFunction, RmcommaConvertFunction, and MemkConvertFunction are asserted to behave the same as NumConvertFunction/AutoConvertFunction for the strings "∞", "Infinity", and "NaN"; update ConversionFunctionsTest.testSpecialValues (or add a sibling test) to call RmunitConvertFunction.convert(...), RmcommaConvertFunction.convert(...), and MemkConvertFunction.convert(...) for each special string and assertNull on each result to cover all new functions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java`:
- Around line 303-313: The enum entries CTIME, MKTIME, DUR2SEC, and MSTIME in
BuiltinFunctionName are declared but have no implementations or registrations
and will fail at runtime; either implement corresponding ConvertFunction classes
and register them in PPLFuncImpTable (similar to AUTO/NUM/MEMK/RMUNIT/RMCOMMA)
and add unit tests, or remove these four enum entries from BuiltinFunctionName
to avoid exposing unimplemented functions (note NONE is handled as a
special-case in AstBuilder.java and can remain). Ensure any new implementations
follow the existing ConvertFunction pattern and are included in test coverage
and the function dispatch table.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java`:
- Around line 11-17: Add JavaDoc comments for the public AutoConvertFunction()
constructor and the public static convert(Object value) entrypoint: for
AutoConvertFunction(), add a short description and an `@throws` tag if the
constructor can propagate any runtime exceptions (or explicitly state none are
thrown); for convert(Object) add a description, an `@param` value explaining
accepted input, an `@return` describing the conversion result, and an `@throws`
documenting the runtime exception(s) that may be raised (e.g., if
convertValue(value) fails). Reference the constructor AutoConvertFunction(), the
static method convert(Object), and the instance helper convertValue(Object) in
the comments so maintainers can trace behavior.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java`:
- Around line 11-17: Add JavaDoc comments for the public MemkConvertFunction()
constructor and the public static convert(Object value) entrypoint: for
MemkConvertFunction() document its purpose and any side effects and include
`@throws` if it can throw (or explicitly state none), and for convert(Object
value) document the parameter (`@param` value the input to convert), the return
value (`@return` the converted object or null) and any exceptions it can propagate
(`@throws` ... or state none); reference the convertValue(Object) helper in the
convert method description to clarify behavior.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/NumConvertFunction.java`:
- Around line 11-17: Add JavaDoc for the public API: document the
NumConvertFunction() constructor and the static method convert(Object value)
with `@param` describing the value argument, `@return` describing the returned
converted numeric/object result, and `@throws` for any runtime exceptions that
convertValue may propagate; reference the NumConvertFunction.convert(Object)
method and the NumConvertFunction() constructor and ensure the JavaDoc text
matches coding guidelines for public methods.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java`:
- Around line 11-17: Add JavaDoc to the public no-arg constructor
RmcommaConvertFunction() and the static entrypoint convert(Object value): for
the constructor, add a short description of the class initialization and include
any `@throws` if construction can fail; for the static convert method, add a
description, an `@param` value describing accepted input types, an `@return`
describing the converted value (or null) and an `@throws` documenting the
exceptions that may be raised during conversion (e.g., IllegalArgumentException
for invalid input or RuntimeException for conversion errors). Reference the
convertValue(Object) call inside convert(Object) so the JavaDoc reflects that
the static method delegates to convertValue for the actual conversion.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/RmunitConvertFunction.java`:
- Around line 11-17: Add JavaDoc comments for the public constructor
RmunitConvertFunction() and the static entrypoint method convert(Object) that
document parameters, return value and possible exceptions; for the constructor
include a brief description and note any thrown runtime exceptions if
applicable, and for convert(Object) add `@param` value describing accepted inputs,
`@return` describing the returned converted object (or null) and `@throws` entries
for any runtime exceptions propagated by convertValue(Object) so the JavaDoc
satisfies the project's `@param`/`@return`/`@throws` requirement while referencing the
convertValue(Object) behavior.
In
`@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java`:
- Around line 175-215: Add missing zero-boundary and numeric-extreme test cases
for MemkConvertFunction in ConversionFunctionsTest: create a test method (e.g.,
testMemkConvertZeroBoundaries) that asserts convert("0"), convert("0k"),
convert("0m"), convert("0g"), convert("+0") and convert("-0") return 0.0 (or
-0.0 as appropriate), and another method (e.g., testMemkConvertNumericExtremes)
that asserts convert(String.valueOf(Double.MAX_VALUE)) returns Double.MAX_VALUE
and that overflow/invalid huge values like "1e309" and "1e309k" return null; use
MemkConvertFunction.convert references to locate where to add these tests.
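The zero/extreme cases those tests would pin down can be sketched self-containedly. Note the kilobyte-based multipliers below are an assumption about memk-style semantics (bare numbers and `k` as kilobytes, `m`/`g` scaled up), not taken from the PR's `MemkConvertFunction`.

```java
// Hypothetical memk-style conversion: a trailing k/m/g suffix scales the
// value into kilobytes; bare numbers default to kilobytes. Illustrative
// sketch only, not the PR's MemkConvertFunction.
public class MemkSketch {
  static Double memk(String value) {
    if (value == null || value.trim().isEmpty()) {
      return null;
    }
    String v = value.trim().toLowerCase();
    double multiplier = 1.0;
    char last = v.charAt(v.length() - 1);
    if (last == 'k' || last == 'm' || last == 'g') {
      multiplier = (last == 'k') ? 1.0 : (last == 'm') ? 1024.0 : 1024.0 * 1024.0;
      v = v.substring(0, v.length() - 1);
    }
    try {
      double parsed = Double.parseDouble(v) * multiplier;
      // "1e309" parses to Infinity rather than throwing; treat it as failure.
      return Double.isFinite(parsed) ? parsed : null;
    } catch (NumberFormatException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(memk("0k"));    // 0.0
    System.out.println(memk("10m"));   // 10240.0
    System.out.println(memk("1e309")); // null (overflow)
  }
}
```

The suggested `testMemkConvertZeroBoundaries` and `testMemkConvertNumericExtremes` would assert exactly these kinds of boundaries against the real implementation.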
In `@docs/user/ppl/cmd/convert.md`:
- Around line 84-100: The examples use non-existent indices
(source=system_metrics in the convert example and source=metrics in the other
example) so update both examples to use the existing test index source=accounts
and simulate the intended fields with eval; locate the convert example
containing the pipeline with "source=system_metrics | eval memory='100m' |
convert memk(memory)" and the other example referencing "source=metrics" and
replace their source clauses with "source=accounts" and add an eval that creates
the memory/metric field values (e.g., eval memory='100m' or eval
<field>=<value>) so the examples run against the standard test data while
keeping the rest of the pipeline unchanged.
- Around line 60-78: The example's displayed row count is wrong for the query
`source=accounts | eval price='1,234' | convert num(price) | fields price`: it
will produce one result row per document in the index (all with price=1234.0).
Either append `| head 1` to the example query to legitimately show "fetched rows
/ total rows = 1/1", or update the shown result block to reflect the full-row
counts (e.g., fetched rows / total rows = N/N) and keep the original query;
modify the example in convert.md accordingly so the query and displayed result
are consistent.
- Around line 237-240: Fix the markdown table closing fence by replacing the
trailing pipe character in the closing separator string
"+------------+-----------+----------|" with a plus so it becomes
"+------------+-----------+----------+"; look for the table block that contains
the rows with "6 | Hattie | Bond" and "13 | Nanette | Bates" and update the
final separator line accordingly to correct the table formatting.
- Around line 19-21: The docs currently list `*` as a supported value for the
`<field>` parameter in the convert command table, but wildcard support is not
implemented (see the "when wildcard support is implemented" comment later);
update the `<field>` row in the parameter table to either remove the `*` example
or explicitly mark it as "not yet supported" (e.g., change "Single field name to
convert, or `*` to convert all fields." to "Single field name to convert;
wildcard `*` not yet supported."). Ensure the text mentions `*` and `<field>` so
reviewers can find and verify the change, and keep the rest of the table
unchanged.
In `@integ-test/src/test/java/org/opensearch/sql/ppl/ConvertCommandIT.java`:
- Around line 72-75: The test method verifyQueryThrowsCalciteError uses
assertThrows but the JUnit 5 static import is missing; add a static import for
org.junit.jupiter.api.Assertions.assertThrows at the top of the test file so the
call in verifyQueryThrowsCalciteError(...) resolves correctly when using JUnit 5
`@Test`.
In `@integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java`:
- Around line 229-238: The test testConvertCommand currently only calls
verifyQuery(result) in the ResponseException catch path, so when executeQuery
succeeds (Calcite enabled) nothing is asserted; change the flow to capture the
successful response into the existing JSONObject result inside the try (assign
the executeQuery return into result) and move a single verifyQuery(result) call
outside the try/catch so it runs for both success and failure; keep the catch
building result from TestUtils.getResponseBody(e.getResponse()) as it does now
so verifyQuery sees the error response when an exception occurs.
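The suggested control flow captures the response on both paths and asserts once afterwards. A minimal sketch, where `executeQuery` and the success/error bodies are stubs standing in for the IT helpers:

```java
// Sketch of the suggested flow: capture the body on both success and failure,
// then run a single verification afterwards. executeQuery is a stub here.
public class VerifyBothPathsSketch {
  static String executeQuery(boolean calciteEnabled) {
    if (!calciteEnabled) {
      throw new RuntimeException("{\"error\":\"convert is not supported\"}");
    }
    return "{\"schema\":[],\"datarows\":[]}";
  }

  static String run(boolean calciteEnabled) {
    String result;
    try {
      result = executeQuery(calciteEnabled);  // success path captures the body
    } catch (RuntimeException e) {
      result = e.getMessage();                // failure path captures the error body
    }
    return result;  // a single verifyQuery(result) would run here, for both paths
  }

  public static void main(String[] args) {
    System.out.println(run(true).contains("datarows"));  // true
    System.out.println(run(false).contains("error"));    // true
  }
}
```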
In
`@integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java`:
- Around line 213-239: Both testCrossClusterConvert and
testCrossClusterConvertWithAlias enable Calcite with enableCalcite() but call
disableCalcite() only on the happy path; wrap the body of each test in a try {
... } finally { disableCalcite(); } so disableCalcite() always runs even if
executeQuery(...) or verifyColumn(...) throws; locate the two test methods
(testCrossClusterConvert and testCrossClusterConvertWithAlias) and move their
enableCalcite() to before the try and ensure the executeQuery/verifyColumn calls
are inside the try block with disableCalcite() in the finally.
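The requested cleanup is the standard try/finally pattern. A self-contained sketch, where `enableCalcite`/`disableCalcite` are stubs standing in for the real IT helpers:

```java
// Sketch of enable-before-try / disable-in-finally, so the Calcite setting
// is restored even when the query throws. The toggle methods are stubs.
public class CalciteToggleSketch {
  static boolean calciteEnabled = false;

  static void enableCalcite() { calciteEnabled = true; }
  static void disableCalcite() { calciteEnabled = false; }

  static void runQuery(boolean fail) {
    enableCalcite();
    try {
      if (fail) {
        throw new RuntimeException("query failed"); // simulated executeQuery failure
      }
    } finally {
      disableCalcite(); // always runs, even on the failure path
    }
  }

  public static void main(String[] args) {
    try {
      runQuery(true);
    } catch (RuntimeException ignored) {
    }
    System.out.println(calciteEnabled); // false: setting was restored
  }
}
```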
In `@ppl/src/main/antlr/OpenSearchPPLLexer.g4`:
- Line 163: The TIMEFORMAT lexer token is unused; either remove the TIMEFORMAT
token definition from OpenSearchPPLLexer.g4 to eliminate dead code, or if it's
intended, add corresponding parser grammar rules and/or Java references that
consume TIMEFORMAT (e.g., incorporate it into relevant parser rules that parse
time formats or make the token part of an existing rule), then regenerate the
lexer/parser and update any tests; search for TIMEFORMAT in the project to
confirm no existing usage before deleting or integrate it where parser rules
expect a time format token.
In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java`:
- Around line 257-271: The test method testConvertAutoWithMemoryField is
misnamed because it uses auto(JOB) aliased as memory_size while JOB is a string
job-title field; rename the test to reflect aliasing behavior (e.g.,
testConvertAutoAliasOnStringField or testConvertAutoWithAlias) or change the
input to use a memory-like field (e.g., replace JOB with a numeric/string
memory_field that contains values like "10k") so the name matches behavior;
update the method name testConvertAutoWithMemoryField and any associated
expected strings (retain auto(JOB) and memory_size if renaming, or replace
auto(JOB)/memory_size occurrences if you change the field) to keep assertions
consistent.
- Around line 1-272: The test class CalcitePPLConvertTest lacks required
NULL-input, boundary-condition, and error-condition tests for the five
conversion functions (auto, num, rmcomma, rmunit, memk); add unit tests that
exercise each function with (1) a nullable field (NULL-input) e.g., use COMM or
another nullable column and assert getRelNode(ppl) + verifyLogical(root, ...)
and verifyPPLToSparkSQL(root, ...), (2) a boundary/empty input case (e.g.,
rmcomma applied to fields that can be empty or empty-string equivalent) and
assert plan/SQL, and (3) an error-condition test that passes an invalid function
or argument and expects a parse/validation exception (use `@Test`(expected =
SomeParseException.class) around getRelNode(ppl)); implement these tests in
CalcitePPLConvertTest using the existing helpers getRelNode, verifyLogical, and
verifyPPLToSparkSQL and name them clearly like testConvertAutoWithNullInput,
testConvertNumBoundaryEmpty, testConvertInvalidFunctionThrows, etc.
---
Nitpick comments:
In `@core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java`:
- Around line 425-427: Add JavaDoc to the public method visitConvert in
AbstractNodeVisitor: document the method purpose and include `@param` for the
Convert node and context (C context), `@return` describing the returned generic T,
and `@throws` for any runtime exceptions that may be propagated; ensure the
JavaDoc style matches other methods in AbstractNodeVisitor and complies with the
core/src/main/java/**/*.java requirement for public methods to have `@param`,
`@return`, and `@throws` tags.
In `@core/src/main/java/org/opensearch/sql/ast/tree/Convert.java`:
- Around line 19-26: Remove the class-level `@Setter` to prevent broad mutation of
the AST node; keep
`@Getter/`@ToString/@EqualsAndHashCode/@RequiredArgsConstructor. Replace the broad
setter by adding a narrowly scoped mutator (e.g., a package-private method like
attach(UnresolvedPlan child) or attachChild(UnresolvedPlan child)) that sets the
private field child so only callers in the same package (or the intended attach
logic) can modify it; leave conversions as final and immutable via the existing
field.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/BaseConversionUDF.java`:
- Around line 25-53: Add JavaDoc for the public API methods: add Javadoc
comments to BaseConversionUDF#getReturnTypeInference() and
BaseConversionUDF#getOperandMetadata() including a brief description and
appropriate `@return` tags; also locate the core template method convertValue(...)
(in this class or ConversionImplementor) and add Javadoc describing its purpose,
parameters (`@param` for each argument), return value (`@return`) and any exceptions
it can throw (`@throws`). Ensure the Javadoc text follows project conventions and
documents nullability/behavior where relevant.
In
`@core/src/test/java/org/opensearch/sql/expression/function/udf/ConversionFunctionsTest.java`:
- Around line 316-327: The test currently asserts null for scientific-notation
strings because RmcommaConvertFunction.convert uses containsLetter(), which
treats 'e'/'E' as alphabetic and thus rejects exponential notation; update the
test method testRmcommaConvertNumericExtremes to include a brief inline comment
(near the assertions referencing RmcommaConvertFunction.convert) stating that
scientific/exponential formats are intentionally rejected due to
containsLetter() treating 'e' as a letter, so these inputs return null, to help
future maintainers understand the behavior.
- Around line 228-236: Extend the special-value test so RmunitConvertFunction,
RmcommaConvertFunction, and MemkConvertFunction are asserted to behave the same
as NumConvertFunction/AutoConvertFunction for the strings "∞", "Infinity", and
"NaN"; update ConversionFunctionsTest.testSpecialValues (or add a sibling test)
to call RmunitConvertFunction.convert(...), RmcommaConvertFunction.convert(...),
and MemkConvertFunction.convert(...) for each special string and assertNull on
each result to cover all new functions.
In
`@integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java`:
- Around line 450-459: Add a concrete row-value assertion to
testCrossClusterConvertWithAlias: after the existing verifyColumn(result,
columnName("balance_num")) and verifySchema(result, schema("balance_num",
"double")), call verifyDataRows(result, rows("balance_num",
"<expected_double>")) to assert at least one converted numeric value is correct
(replace <expected_double> with the known balance value from
TEST_INDEX_BANK_REMOTE as a double); reference symbols:
testCrossClusterConvertWithAlias, executeQuery, verifyColumn, verifySchema,
verifyDataRows.
In
`@integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java`:
- Around line 217-222: Current assertions in CrossClusterSearchIT call
verifyColumn(result, columnName("balance")) after executeQuery(...) which only
checks column presence; update the tests to assert actual converted numeric
values by invoking verifyDataRows(result, expectedRows...) using the same 7
expected balance values used in testCrossClusterSortWithTypeCasting, or at
minimum add assertions that the result is non-empty and that each value in the
converted "balance" column is numeric; locate the call sites around
executeQuery/String.format(...) with TEST_INDEX_BANK_REMOTE and replace or
extend the verifyColumn(...) checks with verifyDataRows(...) (or additional
numeric-type/non-empty assertions) to make the test meaningful.
In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java`:
- Around line 1-272: The test suite in CalcitePPLConvertTest only covers
positive convert cases; add negative grammar tests to validate error handling
(at minimum a test that asserts an invalid conversion function is rejected). In
CalcitePPLConvertTest add a new `@Test` method (e.g., testConvertInvalidFunction)
that calls getRelNode("source=EMP | convert unknown_fn(SAL)") and asserts an
exception is thrown (use assertThrows or the existing test harness), and
similarly add tests for missing argument ("convert auto()"), non-existent field
("convert auto(NONEXISTENT_COL)"), and alias shadowing with none() (e.g.,
"convert none(EMPNO) AS EMPNO") to verify the parser/planner raises/propagates
errors instead of succeeding; use getRelNode and the existing verification
helpers (verifyLogical/verifyPPLToSparkSQL) pattern for positive vs negative
assertions.
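The negative-assertion pattern the prompt describes can be sketched without the JUnit dependency. The helper below is a JUnit-free stand-in for assertThrows, and planInvalidConvert is a hypothetical stub standing in for getRelNode rejecting an invalid query; neither is part of the PR:

```java
/** Minimal stand-in for JUnit's assertThrows, to illustrate the negative-test pattern. */
class NegativeTestSketch {
  static <T extends Throwable> T expectThrows(Class<T> expected, Runnable body) {
    try {
      body.run();
    } catch (Throwable t) {
      if (expected.isInstance(t)) {
        return expected.cast(t); // the expected failure: return it for further inspection
      }
      throw new AssertionError("unexpected exception type: " + t, t);
    }
    throw new AssertionError("expected " + expected.getSimpleName() + " but nothing was thrown");
  }

  // Hypothetical stub for getRelNode("source=EMP | convert unknown_fn(SAL)") rejecting the query.
  static void planInvalidConvert() {
    throw new IllegalArgumentException("unknown conversion function: unknown_fn");
  }
}
```

In the real test class, the same shape becomes `assertThrows(SomeException.class, () -> getRelNode(ppl))`, with one such test per invalid input (unknown function, missing argument, nonexistent field).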
In
`@ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java`:
- Around line 1046-1062: The testConvertCommand in PPLQueryDataAnonymizerTest is
missing coverage for the memk() conversion; add an assertion that calls
anonymize with an input using memk (e.g., "source=t | convert memk(field)") and
expects the anonymized output to map the field to the generic identifier form
consistent with other cases (e.g., "source=table | convert memk(identifier)" or
with AS when appropriate). Update the testConvertCommand method to include this
new assertEquals using the existing anonymize helper so memk conversion is
validated alongside auto, num, rmcomma, rmunit, and none.
core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java
```java
public AutoConvertFunction() {
  super(AutoConvertFunction.class);
}

public static Object convert(Object value) {
  return new AutoConvertFunction().convertValue(value);
}
```
Add JavaDoc for the public constructor and convert entrypoint.
These public methods currently lack JavaDoc with @param/@return/@throws. Please add the required JavaDoc for the constructor and static convert method.
As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java`
around lines 11 - 17, Add JavaDoc comments for the public AutoConvertFunction()
constructor and the public static convert(Object value) entrypoint: for
AutoConvertFunction(), add a short description and an `@throws` tag if the
constructor can propagate any runtime exceptions (or explicitly state none are
thrown); for convert(Object) add a description, an `@param` value explaining
accepted input, an `@return` describing the conversion result, and an `@throws`
documenting the runtime exception(s) that may be raised (e.g., if
convertValue(value) fails). Reference the constructor AutoConvertFunction(), the
static method convert(Object), and the instance helper convertValue(Object) in
the comments so maintainers can trace behavior.
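The requested JavaDoc shape could look as follows. This is a standalone sketch whose convertValue body is a stub written for illustration, not the PR's actual implementation:

```java
/** Illustrative JavaDoc shape for the convert entrypoints; the conversion body is a stub. */
class AutoConvertSketch {

  /** Creates a converter with auto() semantics. No exceptions are thrown at construction. */
  public AutoConvertSketch() {}

  /**
   * Converts {@code value} using auto() semantics by delegating to {@link #convertValue(Object)}.
   *
   * @param value the raw field value; may be a String, a Number, or null
   * @return the converted value, or {@code null} if the input is not convertible
   */
  public static Object convert(Object value) {
    return new AutoConvertSketch().convertValue(value);
  }

  /** Stub conversion: numbers pass through as doubles, numeric strings parse, else null. */
  Object convertValue(Object value) {
    if (value instanceof Number) {
      return ((Number) value).doubleValue();
    }
    if (value instanceof String) {
      try {
        return Double.valueOf((String) value);
      } catch (NumberFormatException e) {
        return null;
      }
    }
    return null;
  }
}
```

The same three-tag pattern (@param, @return, @throws or an explicit "throws nothing") applies verbatim to the Memk, Num, and Rmcomma entrypoints flagged below.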
```java
public MemkConvertFunction() {
  super(MemkConvertFunction.class);
}

public static Object convert(Object value) {
  return new MemkConvertFunction().convertValue(value);
}
```
Add JavaDoc for the public constructor and convert entrypoint.
These public methods currently lack JavaDoc with @param/@return/@throws. Please add the required JavaDoc for the constructor and static convert method.
As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java`
around lines 11 - 17, Add JavaDoc comments for the public MemkConvertFunction()
constructor and the public static convert(Object value) entrypoint: for
MemkConvertFunction() document its purpose and any side effects and include
`@throws` if it can throw (or explicitly state none), and for convert(Object
value) document the parameter (`@param` value the input to convert), the return
value (`@return` the converted object or null) and any exceptions it can propagate
(`@throws` ... or state none); reference the convertValue(Object) helper in the
convert method description to clarify behavior.
```java
public NumConvertFunction() {
  super(NumConvertFunction.class);
}

public static Object convert(Object value) {
  return new NumConvertFunction().convertValue(value);
}
```
Document the public convert API.
The public convert method (and constructor if kept public) should have JavaDoc with @param/@return.
✍️ Example JavaDoc

```java
/**
 * Convert a value using num() semantics.
 *
 * @param value input value
 * @return converted double, or null if not convertible
 */
public static Object convert(Object value) {
  return new NumConvertFunction().convertValue(value);
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/NumConvertFunction.java`
around lines 11 - 17, Add JavaDoc for the public API: document the
NumConvertFunction() constructor and the static method convert(Object value)
with `@param` describing the value argument, `@return` describing the returned
converted numeric/object result, and `@throws` for any runtime exceptions that
convertValue may propagate; reference the NumConvertFunction.convert(Object)
method and the NumConvertFunction() constructor and ensure the JavaDoc text
matches coding guidelines for public methods.
```java
public RmcommaConvertFunction() {
  super(RmcommaConvertFunction.class);
}

public static Object convert(Object value) {
  return new RmcommaConvertFunction().convertValue(value);
}
```
Add JavaDoc for the public constructor and convert entrypoint.
These public methods currently lack JavaDoc with @param/@return/@throws. Please add the required JavaDoc for the constructor and static convert method.
As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java`
around lines 11 - 17, Add JavaDoc to the public no-arg constructor
RmcommaConvertFunction() and the static entrypoint convert(Object value): for
the constructor, add a short description of the class initialization and include
any `@throws` if construction can fail; for the static convert method, add a
description, an `@param` value describing accepted input types, an `@return`
describing the converted value (or null) and an `@throws` documenting the
exceptions that may be raised during conversion (e.g., IllegalArgumentException
for invalid input or RuntimeException for conversion errors). Reference the
convertValue(Object) call inside convert(Object) so the JavaDoc reflects that
the static method delegates to convertValue for the actual conversion.
```java
@Test
public void testCrossClusterConvert() throws IOException {
  enableCalcite();

  JSONObject result =
      executeQuery(
          String.format(
              "search source=%s | convert auto(balance) | fields balance",
              TEST_INDEX_BANK_REMOTE));
  verifyColumn(result, columnName("balance"));

  disableCalcite();
}

@Test
public void testCrossClusterConvertWithAlias() throws IOException {
  enableCalcite();

  JSONObject result =
      executeQuery(
          String.format(
              "search source=%s | convert auto(balance) AS balance_num | fields balance_num",
              TEST_INDEX_BANK_REMOTE));
  verifyColumn(result, columnName("balance_num"));

  disableCalcite();
}
```
disableCalcite() must be called in a finally block to prevent Calcite state leaking into subsequent tests.
Both testCrossClusterConvert and testCrossClusterConvertWithAlias enable Calcite at the top but only disable it on the happy path. Any exception thrown from executeQuery or verifyColumn (e.g., an assertion failure, an IOException, or a network error) will skip disableCalcite(), leaving Calcite enabled for every test that runs afterwards. This violates test independence and can cause cascading, order-dependent failures in the rest of the suite.
🛡️ Proposed fix: wrap in try-finally

```diff
 @Test
 public void testCrossClusterConvert() throws IOException {
   enableCalcite();
-
-  JSONObject result =
-      executeQuery(
-          String.format(
-              "search source=%s | convert auto(balance) | fields balance",
-              TEST_INDEX_BANK_REMOTE));
-  verifyColumn(result, columnName("balance"));
-
-  disableCalcite();
+  try {
+    JSONObject result =
+        executeQuery(
+            String.format(
+                "search source=%s | convert auto(balance) | fields balance",
+                TEST_INDEX_BANK_REMOTE));
+    verifyColumn(result, columnName("balance"));
+  } finally {
+    disableCalcite();
+  }
 }

 @Test
 public void testCrossClusterConvertWithAlias() throws IOException {
   enableCalcite();
-
-  JSONObject result =
-      executeQuery(
-          String.format(
-              "search source=%s | convert auto(balance) AS balance_num | fields balance_num",
-              TEST_INDEX_BANK_REMOTE));
-  verifyColumn(result, columnName("balance_num"));
-
-  disableCalcite();
+  try {
+    JSONObject result =
+        executeQuery(
+            String.format(
+                "search source=%s | convert auto(balance) AS balance_num | fields balance_num",
+                TEST_INDEX_BANK_REMOTE));
+    verifyColumn(result, columnName("balance_num"));
+  } finally {
+    disableCalcite();
+  }
 }
```

As per coding guidelines, "Tests must not rely on execution order; ensure test independence" and "Ensure tests are independent and can run in any order."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@integ-test/src/test/java/org/opensearch/sql/security/CrossClusterSearchIT.java`
around lines 213 - 239, Both testCrossClusterConvert and
testCrossClusterConvertWithAlias enable Calcite with enableCalcite() but call
disableCalcite() only on the happy path; wrap the body of each test in a try {
... } finally { disableCalcite(); } so disableCalcite() always runs even if
executeQuery(...) or verifyColumn(...) throws; locate the two test methods
(testCrossClusterConvert and testCrossClusterConvertWithAlias) and move their
enableCalcite() to before the try and ensure the executeQuery/verifyColumn calls
are inside the try block with disableCalcite() in the finally.
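Beyond try/finally, the same always-restore guarantee can be expressed with try-with-resources. A sketch, assuming a toggle with enable/disable methods; the Calcite helpers here are stand-ins for the suite's real ones:

```java
/** AutoCloseable guard: disable() always runs when the scope exits, even on exceptions. */
class CalciteGuardSketch {
  static boolean calciteEnabled = false; // stand-in for the cluster setting

  static void enableCalcite() { calciteEnabled = true; }

  static void disableCalcite() { calciteEnabled = false; }

  /** Enables Calcite and returns a scope whose close() restores the disabled state. */
  static AutoCloseable calciteScope() {
    enableCalcite();
    return CalciteGuardSketch::disableCalcite;
  }
}
```

Usage in a test body would be `try (AutoCloseable scope = calciteScope()) { ... }`, which removes the risk of a forgotten disableCalcite() on any early exit path.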
```antlr
OTHERSTR: 'OTHERSTR';
NULLSTR: 'NULLSTR';
TIMEFIELD: 'TIMEFIELD';
TIMEFORMAT: 'TIMEFORMAT';
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Check if TIMEFORMAT is referenced in any parser grammar or Java source files
echo "=== References in .g4 parser files ==="
rg -n 'TIMEFORMAT' -t g4 2>/dev/null || echo "No g4 type found, trying without type filter..."
rg -n 'TIMEFORMAT' --type-list | grep -i g4 > /dev/null && rg -n 'TIMEFORMAT' --type-add 'g4:*.g4' -t g4 || rg -n 'TIMEFORMAT' '\.g4$'
echo ""
echo "=== References in Java files ==="
rg -n 'TIMEFORMAT' -t java 2>/dev/null | head -20
echo ""
echo "=== All references in repo ==="
rg -n 'TIMEFORMAT' | head -30
```

Repository: opensearch-project/sql
Length of output: 766
Remove the unused TIMEFORMAT token or verify it's needed elsewhere.
The TIMEFORMAT token at line 163 is defined in the lexer but has no references anywhere in the parser grammar or Java codebase. If this was added inadvertently or is planned for future use, either integrate it into parser rules or remove it to avoid dead code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ppl/src/main/antlr/OpenSearchPPLLexer.g4` at line 163, The TIMEFORMAT lexer
token is unused; either remove the TIMEFORMAT token definition from
OpenSearchPPLLexer.g4 to eliminate dead code, or if it's intended, add
corresponding parser grammar rules and/or Java references that consume
TIMEFORMAT (e.g., incorporate it into relevant parser rules that parse time
formats or make the token part of an existing rule), then regenerate the
lexer/parser and update any tests; search for TIMEFORMAT in the project to
confirm no existing usage before deleting or integrate it where parser rules
expect a time format token.
```java
/*
 * Copyright OpenSearch Contributors
 * SPDX-License-Identifier: Apache-2.0
 */

package org.opensearch.sql.ppl.calcite;

import org.apache.calcite.rel.RelNode;
import org.apache.calcite.test.CalciteAssert;
import org.junit.Test;

/** Unit tests for PPL convert command. */
public class CalcitePPLConvertTest extends CalcitePPLAbstractTest {

  public CalcitePPLConvertTest() {
    super(CalciteAssert.SchemaSpec.SCOTT_WITH_TEMPORAL);
  }

  @Test
  public void testConvertBasic() {
    String ppl = "source=EMP | convert auto(SAL)";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[AUTO($5)],"
            + " COMM=[$6], DEPTNO=[$7])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, AUTO(`SAL`) `SAL`, `COMM`, `DEPTNO`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertWithAlias() {
    String ppl = "source=EMP | convert auto(SAL) AS salary_num";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
            + " COMM=[$6], DEPTNO=[$7], salary_num=[AUTO($5)])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`SAL`)"
            + " `salary_num`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertMultipleFunctions() {
    String ppl = "source=EMP | convert auto(SAL), num(COMM)";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[AUTO($5)],"
            + " COMM=[NUM($6)], DEPTNO=[$7])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, AUTO(`SAL`) `SAL`, NUM(`COMM`) `COMM`,"
            + " `DEPTNO`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertMultipleWithAliases() {
    String ppl = "source=EMP | convert auto(SAL) AS salary, num(COMM) AS commission";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
            + " COMM=[$6], DEPTNO=[$7], salary=[AUTO($5)], commission=[NUM($6)])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`SAL`)"
            + " `salary`, NUM(`COMM`) `commission`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertWithFields() {
    String ppl = "source=EMP | convert auto(SAL) AS salary_num | fields EMPNO, ENAME, salary_num";
    RelNode root = getRelNode(ppl);
    // Calcite optimizes the two projections into one - this is more efficient
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[$1], salary_num=[AUTO($5)])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, AUTO(`SAL`) `salary_num`\n" + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertNumFunction() {
    String ppl = "source=EMP | convert num(SAL)";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[NUM($5)],"
            + " COMM=[$6], DEPTNO=[$7])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, NUM(`SAL`) `SAL`, `COMM`, `DEPTNO`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertRmcommaFunction() {
    String ppl = "source=EMP | convert rmcomma(ENAME)";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[RMCOMMA($1)], JOB=[$2], MGR=[$3], HIREDATE=[$4],"
            + " SAL=[$5], COMM=[$6], DEPTNO=[$7])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, RMCOMMA(`ENAME`) `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`,"
            + " `DEPTNO`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertRmunitFunction() {
    String ppl = "source=EMP | convert rmunit(ENAME)";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[RMUNIT($1)], JOB=[$2], MGR=[$3], HIREDATE=[$4],"
            + " SAL=[$5], COMM=[$6], DEPTNO=[$7])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, RMUNIT(`ENAME`) `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`,"
            + " `DEPTNO`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertMemkFunction() {
    String ppl = "source=EMP | convert memk(ENAME)";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[MEMK($1)], JOB=[$2], MGR=[$3], HIREDATE=[$4],"
            + " SAL=[$5], COMM=[$6], DEPTNO=[$7])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, MEMK(`ENAME`) `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`,"
            + " `DEPTNO`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertNoneFunction() {
    String ppl = "source=EMP | convert none(ENAME)";
    RelNode root = getRelNode(ppl);
    String expectedLogical = "LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql = "SELECT *\n" + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertWithWhere() {
    String ppl = "source=EMP | where DEPTNO = 10 | convert auto(SAL)";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[AUTO($5)],"
            + " COMM=[$6], DEPTNO=[$7])\n"
            + " LogicalFilter(condition=[=($7, 10)])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, AUTO(`SAL`) `SAL`, `COMM`, `DEPTNO`\n"
            + "FROM `scott`.`EMP`\n"
            + "WHERE `DEPTNO` = 10";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertWithSort() {
    String ppl = "source=EMP | convert auto(SAL) AS salary_num | sort - salary_num | head 3";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalSort(sort0=[$8], dir0=[DESC-nulls-last], fetch=[3])\n"
            + " LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4],"
            + " SAL=[$5], COMM=[$6], DEPTNO=[$7], salary_num=[AUTO($5)])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`SAL`)"
            + " `salary_num`\n"
            + "FROM `scott`.`EMP`\n"
            + "ORDER BY 9 DESC\n"
            + "LIMIT 3";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertWithStats() {
    String ppl = "source=EMP | convert auto(SAL) AS salary_num | stats avg(salary_num) by DEPTNO";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(avg(salary_num)=[$1], DEPTNO=[$0])\n"
            + " LogicalAggregate(group=[{0}], avg(salary_num)=[AVG($1)])\n"
            + " LogicalProject(DEPTNO=[$7], salary_num=[AUTO($5)])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT AVG(AUTO(`SAL`)) `avg(salary_num)`, `DEPTNO`\n"
            + "FROM `scott`.`EMP`\n"
            + "GROUP BY `DEPTNO`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertAllFunctions() {
    String ppl =
        "source=EMP | convert auto(SAL) AS sal_auto, num(COMM) AS comm_num, rmcomma(ENAME) AS"
            + " name_clean, rmunit(JOB) AS job_clean, none(EMPNO) AS empno_same";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
            + " COMM=[$6], DEPTNO=[$7], sal_auto=[AUTO($5)], comm_num=[NUM($6)],"
            + " name_clean=[RMCOMMA($1)], job_clean=[RMUNIT($2)], empno_same=[$0])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`SAL`)"
            + " `sal_auto`, NUM(`COMM`) `comm_num`, RMCOMMA(`ENAME`) `name_clean`, RMUNIT(`JOB`)"
            + " `job_clean`, `EMPNO` `empno_same`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }

  @Test
  public void testConvertAutoWithMemoryField() {
    String ppl = "source=EMP | convert auto(JOB) AS memory_size";
    RelNode root = getRelNode(ppl);
    String expectedLogical =
        "LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
            + " COMM=[$6], DEPTNO=[$7], memory_size=[AUTO($2)])\n"
            + " LogicalTableScan(table=[[scott, EMP]])\n";
    verifyLogical(root, expectedLogical);

    String expectedSparkSql =
        "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`JOB`)"
            + " `memory_size`\n"
            + "FROM `scott`.`EMP`";
    verifyPPLToSparkSQL(root, expectedSparkSql);
  }
}
```
Missing required NULL-input, boundary-condition, and error-condition tests for all five conversion functions
The coding guidelines for **/+(*Test|*IT).java files explicitly mandate:
- NULL input tests for all new functions
- Boundary condition tests (min/max values, empty inputs)
- Error condition tests (invalid inputs, exceptions)
None of these are present in this class. The five conversion functions (auto, num, rmcomma, rmunit, memk) each need at least one of each category. Examples of what should be added:
```java
// NULL input — verifying SQL generation/plan doesn't blow up on nullable field
@Test
public void testConvertNumWithNullInput() {
  // COMM is a nullable field in the SCOTT EMP schema
  String ppl = "source=EMP | convert num(COMM)";
  RelNode root = getRelNode(ppl);
  // assert plan handles null-bearing field correctly
  verifyLogical(root, ...);
  verifyPPLToSparkSQL(root, ...);
}

// Boundary condition — empty-string equivalent: applying rmcomma to a field
// with potentially empty-string values
@Test
public void testConvertRmcommaWithEmptyString() { ... }

// Error condition — converting a field with a nonsensical function argument
// or verifying an exception is thrown on illegal input
@Test(expected = SomeParseException.class)
public void testConvertInvalidFunctionThrows() {
  String ppl = "source=EMP | convert unknown_fn(SAL)";
  getRelNode(ppl);
}
```

As per coding guidelines, NULL input tests must be included for all new functions and boundary condition tests (min/max values, empty inputs) for all new functions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java`
around lines 1 - 272, The test class CalcitePPLConvertTest lacks required
NULL-input, boundary-condition, and error-condition tests for the five
conversion functions (auto, num, rmcomma, rmunit, memk); add unit tests that
exercise each function with (1) a nullable field (NULL-input) e.g., use COMM or
another nullable column and assert getRelNode(ppl) + verifyLogical(root, ...)
and verifyPPLToSparkSQL(root, ...), (2) a boundary/empty input case (e.g.,
rmcomma applied to fields that can be empty or empty-string equivalent) and
assert plan/SQL, and (3) an error-condition test that passes an invalid function
or argument and expects a parse/validation exception (use `@Test`(expected =
SomeParseException.class) around getRelNode(ppl)); implement these tests in
CalcitePPLConvertTest using the existing helpers getRelNode, verifyLogical, and
verifyPPLToSparkSQL and name them clearly like testConvertAutoWithNullInput,
testConvertNumBoundaryEmpty, testConvertInvalidFunctionThrows, etc.
```java
public void testConvertAutoWithMemoryField() {
  String ppl = "source=EMP | convert auto(JOB) AS memory_size";
  RelNode root = getRelNode(ppl);
  String expectedLogical =
      "LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5],"
          + " COMM=[$6], DEPTNO=[$7], memory_size=[AUTO($2)])\n"
          + " LogicalTableScan(table=[[scott, EMP]])\n";
  verifyLogical(root, expectedLogical);

  String expectedSparkSql =
      "SELECT `EMPNO`, `ENAME`, `JOB`, `MGR`, `HIREDATE`, `SAL`, `COMM`, `DEPTNO`, AUTO(`JOB`)"
          + " `memory_size`\n"
          + "FROM `scott`.`EMP`";
  verifyPPLToSparkSQL(root, expectedSparkSql);
}
```
Misleading test name: testConvertAutoWithMemoryField
The test applies auto(JOB) with alias memory_size, but JOB is a text job-title field in the SCOTT schema — not a memory-related field. The name implies memory-specific behavior (e.g., "10k", "1.5M" strings) is being exercised, which it isn't. This is misleading for anyone trying to understand the test intent.
Consider renaming to testConvertAutoWithAlias (since testConvertWithAlias already covers SAL with auto, this could be testConvertAutoAliasOnStringField) or updating the test to actually use a field resembling a memory value.
✏️ Suggested rename

```diff
- public void testConvertAutoWithMemoryField() {
-   String ppl = "source=EMP | convert auto(JOB) AS memory_size";
+ public void testConvertAutoAliasOnStringField() {
+   String ppl = "source=EMP | convert auto(JOB) AS job_converted";
```

As per coding guidelines, test naming should follow the pattern test<Functionality><Condition>.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLConvertTest.java`
around lines 257 - 271, The test method testConvertAutoWithMemoryField is
misnamed because it uses auto(JOB) aliased as memory_size while JOB is a string
job-title field; rename the test to reflect aliasing behavior (e.g.,
testConvertAutoAliasOnStringField or testConvertAutoWithAlias) or change the
input to use a memory-like field (e.g., replace JOB with a numeric/string
memory_field that contains values like "10k") so the name matches behavior;
update the method name testConvertAutoWithMemoryField and any associated
expected strings (retain auto(JOB) and memory_size if renaming, or replace
auto(JOB)/memory_size occurrences if you change the field) to keep assertions
consistent.
Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
Actionable comments posted: 5
🧹 Nitpick comments (2)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
984-1093: Consider validating allowed convert functions. If convert is meant to accept only the conversion UDFs (auto/num/rmcomma/rmunit/none/memk), this currently permits any single-argument function name. Consider an allowlist check (or explicitly document that arbitrary UDFs are accepted).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java` around lines 984 - 1093, The convert implementation accepts any single-argument function name; restrict it to the allowed conversion UDFs by adding an allowlist check in processFunctionConversion (before calling PPLFuncImpTable.INSTANCE.resolve): define the set of permitted names (e.g., "auto","num","rmcomma","rmunit","none","memk") and if functionName is not in that set throw a SemanticCheckException with a clear message; keep the rest of processFunctionConversion logic (replacements/additions) intact and update any related tests or docs if you intend to allow additional functions.

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (1)
1206-1214: Prefer Objects::nonNull over inline null check; confirm empty-list behavior

Objects is already imported. The inline lambda .filter(conversion -> conversion != null) should be .filter(Objects::nonNull). Additionally, if every conversion is a none(field) with the same source and target, conversions will be empty. Verify that a Convert node with an empty conversion list is handled gracefully downstream (analyzer, physical operator, Calcite planner).

♻️ Proposed refactor

```diff
-    List<Let> conversions =
-        ctx.convertFunction().stream()
-            .map(this::buildConversion)
-            .filter(conversion -> conversion != null)
-            .collect(Collectors.toList());
+    List<Let> conversions =
+        ctx.convertFunction().stream()
+            .map(this::buildConversion)
+            .filter(Objects::nonNull)
+            .collect(Collectors.toList());
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` around lines 1206 - 1214, Replace the inline null-check filter in visitConvertCommand by using Objects::nonNull (i.e., change .filter(conversion -> conversion != null) to .filter(Objects::nonNull)) so the stream is cleaner; while here, confirm behavior when buildConversion yields no conversions (e.g., all none(field)) by verifying downstream handling of a Convert constructed with an empty conversions list: trace usages of Convert and the conversions list through the analyzer, physical operator builders, and Calcite planner and either ensure they accept an empty list or add a guard in visitConvertCommand to produce an appropriate no-op/validation/error if empty conversions are invalid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java`:
- Around line 460-461: Add a missing closing brace to end the method
testCrossClusterConvertWithAlias and add the JUnit `@Test` annotation immediately
before the method declaration of testCrossClusterMvExpandBasic so the first
method is properly closed and the second is recognized as a test; locate the
methods testCrossClusterConvertWithAlias and testCrossClusterMvExpandBasic in
CalciteCrossClusterSearchIT and insert a single closing '}' after the
verifySchema(...) call and add '@Test' on the line directly above the public
void testCrossClusterMvExpandBasic() declaration.
In `@ppl/src/main/antlr/OpenSearchPPLParser.g4`:
- Around line 543-549: Add negative/grammar-level tests for
convertCommand/convertFunction in CalcitePPLConvertTest: create test cases that
feed malformed convert syntaxes (e.g., missing field argument inside
LT_PRTHS...RT_PRTHS, malformed function names, missing LT_PRTHS/RT_PRTHS,
stray/extra COMMA, or misplaced AS alias) and assert the parser fails (throws a
RecognitionException or records parse errors) rather than building an AST;
reference convertCommand, convertFunction, visitConvertCommand, and
buildConversion to ensure the tests target parsing failures (not just
semantic/AST logic) and cover edge cases alongside the existing 14 positive
tests.
In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java`:
- Around line 1225-1229: The current check for the "none" no-op uses toString()
comparison between fieldArg and targetField.getField(), which is fragile;
instead detect the no-op by checking the function context alias (funcCtx.alias
== null) because determineTargetField returns the same field when no alias is
provided. Replace the ternary that returns null or new Let(targetField,
fieldArg) with a condition that if "none".equalsIgnoreCase(functionName) and
funcCtx.alias == null then return null, otherwise return new Let(targetField,
fieldArg); update the block referencing functionName, fieldArg, targetField (and
Let) accordingly so the redundant new Let(field, field) is not created when
alias is absent.
- Around line 1221-1231: The parser extracts functionName via
funcCtx.functionName.getText() but passes the raw string into AstDSL.function;
keep the equalsIgnoreCase("none") check as-is but normalize the name before
creating the function node: compute a normalizedName (e.g.,
functionName.toLowerCase()) and use that when calling
AstDSL.function(normalizedName, fieldArg) so the Let creation and
AstDSL.function use a consistent normalized function name (references:
functionName, funcCtx.functionName.getText(), AstDSL.function,
determineTargetField, Let).
In `@ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java`:
- Around line 463-495: In visitConvert, handle the case where
conversion.getExpression() is a Field (the "none(field) AS alias" case) so
functionName isn’t left empty: if conversion.getExpression() is a Field, set
functionName = "none", set actualSourceField to that Field's string, and set
fields to MASK_COLUMN (so the resulting format becomes "none(***) AS ***"); keep
the existing logic for Function nodes unchanged and preserve the existing
asClause/targetField logic that compares targetField to actualSourceField.
---
Duplicate comments:
In `@integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java`:
- Around line 232-241: The test currently only verifies the result inside the
catch block, so when executeQuery succeeds the test does not assert anything;
modify testConvertCommand to capture the successful response into the existing
JSONObject result and call verifyQuery(result) on the success path as well:
inside the try block assign the response from executeQuery to result (or parse
the returned response body into result using TestUtils.getResponseBody if
executeQuery returns a response object) and then call verifyQuery(result); keep
the existing catch block for ResponseException to preserve the error-path
behavior.
In `@ppl/src/main/antlr/OpenSearchPPLLexer.g4`:
- Line 164: The TIMEFORMAT lexer token is currently unused in the grammar;
either remove its definition from OpenSearchPPLLexer.g4 or wire it into parser
rules that consume it (e.g., add a parser rule like timeFormatSpec : TIMEFORMAT
... or use TIMEFORMAT in existing rules such as functionCall, literal, or format
clauses). Locate the TIMEFORMAT token in OpenSearchPPLLexer.g4 and decide:
delete the 'TIMEFORMAT' token line and run a project-wide search for TIMEFORMAT
to ensure no references remain, or add the appropriate parser rule(s) and update
any callers (e.g., format-related rules) so the token is actually consumed by
the parser.
---
Nitpick comments:
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`:
- Around line 984-1093: The convert implementation accepts any single-argument
function name; restrict it to the allowed conversion UDFs by adding an allowlist
check in processFunctionConversion (before calling
PPLFuncImpTable.INSTANCE.resolve): define the set of permitted names (e.g.,
"auto","num","rmcomma","rmunit","none","memk") and if functionName is not in
that set throw a SemanticCheckException with a clear message; keep the rest of
processFunctionConversion logic (replacements/additions) intact and update any
related tests or docs if you intend to allow additional functions.
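As a standalone illustration of the allowlist check this comment suggests: the sketch below is hypothetical, not the actual `CalciteRelNodeVisitor` code, and it throws `IllegalArgumentException` only to stay self-contained where the real code would use the project's `SemanticCheckException`.

```java
import java.util.Locale;
import java.util.Set;

public class ConvertFunctionAllowlist {
  // Permitted conversion UDF names taken from the review suggestion.
  static final Set<String> ALLOWED =
      Set.of("auto", "num", "rmcomma", "rmunit", "none", "memk");

  static boolean isAllowed(String functionName) {
    return ALLOWED.contains(functionName.toLowerCase(Locale.ROOT));
  }

  static void validate(String functionName) {
    if (!isAllowed(functionName)) {
      // Real code would throw SemanticCheckException; this stand-in keeps the
      // sketch compilable without project dependencies.
      throw new IllegalArgumentException(
          "Unsupported conversion function: " + functionName);
    }
  }

  public static void main(String[] args) {
    validate("AUTO"); // accepted: matching is case-insensitive
    try {
      validate("substr"); // rejected: not a conversion UDF
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The check would run in `processFunctionConversion` before function resolution, so an unsupported name fails fast with a clear message instead of surfacing as an opaque resolution error.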
In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java`:
- Around line 1206-1214: Replace the inline null-check filter in
visitConvertCommand by using Objects::nonNull (i.e., change .filter(conversion
-> conversion != null) to .filter(Objects::nonNull)) so the stream is cleaner;
while here, confirm behavior when buildConversion yields no conversions (e.g.,
all none(field)) by verifying downstream handling of a Convert constructed with
an empty conversions list—trace usages of Convert and the conversions list
through the analyzer, physical operator builders, and Calcite planner and either
ensure they accept an empty list or add a guard in visitConvertCommand to
produce an appropriate no-op/validation/error if empty conversions are invalid.
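A minimal, self-contained sketch of the suggested refactor (the field names in the sample data are illustrative, not taken from the PR):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class NonNullFilterDemo {
  // Same shape as the proposed change: drop null conversions with a
  // method reference instead of the inline lambda c -> c != null.
  static List<String> filterNonNull(List<String> conversions) {
    return conversions.stream()
        .filter(Objects::nonNull)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(
        filterNonNull(Arrays.asList("auto(duration)", null, "rmcomma(sales)")));
    // [auto(duration), rmcomma(sales)]
  }
}
```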
```java
    verifySchema(result, schema("balance_num", "double"));
  public void testCrossClusterMvExpandBasic() throws IOException {
```
🧩 Analysis chain
🏁 Script executed:

```shell
fd -I "CalciteCrossClusterSearchIT.java" --exec sed -n '455,475p' {}
```

Repository: opensearch-project/sql
Length of output: 1021
Compile error: missing closing `}` for testCrossClusterConvertWithAlias; `@Test` annotation also missing from testCrossClusterMvExpandBasic.
The file is missing two required lines between line 460 and line 461:
- A closing brace `}` to end `testCrossClusterConvertWithAlias`: without it, the declaration of `testCrossClusterMvExpandBasic` is parsed as a statement inside `testCrossClusterConvertWithAlias`'s body, causing a Java compilation error.
- The `@Test` annotation before `testCrossClusterMvExpandBasic`: without it, the method will not be recognized as a test and will not run.

Proposed fix

```diff
     verifySchema(result, schema("balance_num", "double"));
+  }
+
+  @Test
   public void testCrossClusterMvExpandBasic() throws IOException {
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```java
    verifySchema(result, schema("balance_num", "double"));
  }

  @Test
  public void testCrossClusterMvExpandBasic() throws IOException {
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java`
around lines 460 - 461, Add a missing closing brace to end the method
testCrossClusterConvertWithAlias and add the JUnit `@Test` annotation immediately
before the method declaration of testCrossClusterMvExpandBasic so the first
method is properly closed and the second is recognized as a test; locate the
methods testCrossClusterConvertWithAlias and testCrossClusterMvExpandBasic in
CalciteCrossClusterSearchIT and insert a single closing '}' after the
verifySchema(...) call and add '@Test' on the line directly above the public
void testCrossClusterMvExpandBasic() declaration.
```antlr
convertCommand
    : CONVERT convertFunction (COMMA? convertFunction)*
    ;

convertFunction
    : functionName = ident LT_PRTHS fieldExpression RT_PRTHS (AS alias = fieldExpression)?
    ;
```
Add negative/grammar-level tests for convertCommand syntax validation.
AST builder support is confirmed—visitConvertCommand properly handles convertFunction contexts and aliasing via buildConversion. However, CalcitePPLConvertTest covers only positive cases (14 test methods). Add tests for invalid syntax (e.g., missing field argument, malformed function names, edge cases) at the parser/grammar level to validate error handling during parsing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ppl/src/main/antlr/OpenSearchPPLParser.g4` around lines 543 - 549, Add
negative/grammar-level tests for convertCommand/convertFunction in
CalcitePPLConvertTest: create test cases that feed malformed convert syntaxes
(e.g., missing field argument inside LT_PRTHS...RT_PRTHS, malformed function
names, missing LT_PRTHS/RT_PRTHS, stray/extra COMMA, or misplaced AS alias) and
assert the parser fails (throws a RecognitionException or records parse errors)
rather than building an AST; reference convertCommand, convertFunction,
visitConvertCommand, and buildConversion to ensure the tests target parsing
failures (not just semantic/AST logic) and cover edge cases alongside the
existing 14 positive tests.
```java
String functionName = funcCtx.functionName.getText();
UnresolvedExpression fieldArg = internalVisitExpression(funcCtx.fieldExpression(0));
Field targetField = determineTargetField(funcCtx, fieldArg);

if ("none".equalsIgnoreCase(functionName)) {
  return fieldArg.toString().equals(targetField.getField().toString())
      ? null
      : new Let(targetField, fieldArg);
}

return new Let(targetField, AstDSL.function(functionName, fieldArg));
```
Normalize function name to lowercase before passing to AstDSL.function
The function name from funcCtx.functionName.getText() is passed to AstDSL.function() without normalization. While the "none" check correctly uses equalsIgnoreCase(), the function node itself uses the raw, un-normalized string. PPL's convert functions are registered in uppercase (AUTO, NUM, RMCOMMA, RMUNIT, MEMK) in PPLBuiltinOperators. Although downstream case-insensitive resolution may mask this issue, normalizing at parse time ensures consistency and prevents potential failures if the resolver behavior changes.
🛠️ Suggested fix

```diff
- String functionName = funcCtx.functionName.getText();
+ String functionName = funcCtx.functionName.getText().toLowerCase(Locale.ROOT);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` around lines
1221 - 1231, The parser extracts functionName via funcCtx.functionName.getText()
but passes the raw string into AstDSL.function; keep the
equalsIgnoreCase("none") check as-is but normalize the name before creating the
function node: compute a normalizedName (e.g., functionName.toLowerCase()) and
use that when calling AstDSL.function(normalizedName, fieldArg) so the Let
creation and AstDSL.function use a consistent normalized function name
(references: functionName, funcCtx.functionName.getText(), AstDSL.function,
determineTargetField, Let).
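A self-contained illustration of the normalization being suggested; the helper name is made up, and `Locale.ROOT` matters because default-locale lowercasing is not stable (Turkish locales, for example, map `I` to a dotless `ı`):

```java
import java.util.Locale;

public class NormalizeFunctionNameDemo {
  // Locale-independent lowercasing, as in the suggested fix.
  static String normalize(String functionName) {
    return functionName.toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    System.out.println(normalize("RMCOMMA")); // rmcomma
    System.out.println(normalize("None"));    // none
  }
}
```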
```java
if ("none".equalsIgnoreCase(functionName)) {
  return fieldArg.toString().equals(targetField.getField().toString())
      ? null
      : new Let(targetField, fieldArg);
}
```
Use funcCtx.alias == null to detect none no-op instead of fragile toString() comparison
The Field class uses Lombok's @ToString annotation, which generates a representation like Field(field=..., fieldArgs=[...]). When fieldArg is a Field and targetField.getField() is a QualifiedName (which has a custom toString() returning the plain field name), these string representations are fundamentally different and will never be equal. This means the null branch is never taken, and none(field) without an alias always produces a redundant new Let(field, field) instead of being filtered out.
Since determineTargetField returns the same field object when no alias is provided, the correct no-op condition is simply:
🛠️ Proposed fix
if ("none".equalsIgnoreCase(functionName)) {
- return fieldArg.toString().equals(targetField.getField().toString())
- ? null
- : new Let(targetField, fieldArg);
+ // When no alias, none(field) is a no-op (source == target); drop it.
+ return funcCtx.alias == null ? null : new Let(targetField, fieldArg);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` around lines
1225 - 1229, The current check for the "none" no-op uses toString() comparison
between fieldArg and targetField.getField(), which is fragile; instead detect
the no-op by checking the function context alias (funcCtx.alias == null) because
determineTargetField returns the same field when no alias is provided. Replace
the ternary that returns null or new Let(targetField, fieldArg) with a condition
that if "none".equalsIgnoreCase(functionName) and funcCtx.alias == null then
return null, otherwise return new Let(targetField, fieldArg); update the block
referencing functionName, fieldArg, targetField (and Let) accordingly so the
redundant new Let(field, field) is not created when alias is absent.
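The mismatch described above can be reproduced with simplified stand-ins. These are hypothetical classes mimicking the toString() shapes in question (Lombok-style @ToString on Field vs. a custom toString() on QualifiedName), not the real AST classes:

```java
// Hypothetical, simplified stand-ins for Field and QualifiedName that mimic
// the toString() shapes described in the comment above.
final class ToStringMismatchSketch {
    static final class QualifiedName {
        final String name;
        QualifiedName(String name) { this.name = name; }
        // Custom toString(): returns the plain field name, e.g. "age"
        @Override public String toString() { return name; }
    }

    static final class Field {
        final QualifiedName field;
        Field(QualifiedName field) { this.field = field; }
        // Approximates Lombok's generated @ToString output:
        // "Field(field=age, fieldArgs=[])"
        @Override public String toString() {
            return "Field(field=" + field + ", fieldArgs=[])";
        }
    }
}
```

Here `f.toString()` yields `Field(field=age, fieldArgs=[])` while `f.field.toString()` yields `age`, so the equality check can never succeed and the null branch is dead code — which is why keying on `funcCtx.alias` is the safer condition.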
@Override
public String visitConvert(Convert node, String context) {
  String child = node.getChild().get(0).accept(this, context);
  String conversions =
      node.getConversions().stream()
          .map(
              conversion -> {
                String functionName = "";
                String fields = MASK_COLUMN;
                String actualSourceField = "";

                if (conversion.getExpression() instanceof Function) {
                  Function func = (Function) conversion.getExpression();
                  functionName = func.getFuncName().toLowerCase(Locale.ROOT);
                  if (!func.getFuncArgs().isEmpty()
                      && func.getFuncArgs().get(0) instanceof Field) {
                    actualSourceField = ((Field) func.getFuncArgs().get(0)).getField().toString();
                  }
                  fields =
                      func.getFuncArgs().stream()
                          .map(arg -> MASK_COLUMN)
                          .collect(Collectors.joining(","));
                }

                String targetField = conversion.getVar().getField().toString();

                String asClause =
                    !targetField.equals(actualSourceField) ? " AS " + MASK_COLUMN : "";
                return StringUtils.format("%s(%s)%s", functionName, fields, asClause);
              })
          .collect(Collectors.joining(","));
  return StringUtils.format("%s | convert %s", child, conversions);
}
none(field) AS alias conversions produce an empty function name in the anonymized output
When buildConversion in AstBuilder handles a none function with an alias, it stores the raw fieldArg (Field) — not a Function — as the Let expression. Here in visitConvert, the instanceof Function guard is never satisfied for that case, so functionName remains "" and the output is "(***) AS ***" instead of "none(***) AS ***".
🛠️ Suggested fix in the anonymizer
if (conversion.getExpression() instanceof Function) {
Function func = (Function) conversion.getExpression();
functionName = func.getFuncName().toLowerCase(Locale.ROOT);
if (!func.getFuncArgs().isEmpty()
&& func.getFuncArgs().get(0) instanceof Field) {
actualSourceField = ((Field) func.getFuncArgs().get(0)).getField().toString();
}
fields =
func.getFuncArgs().stream()
.map(arg -> MASK_COLUMN)
.collect(Collectors.joining(","));
+ } else if (conversion.getExpression() instanceof Field sourceField) {
+ // none(field) AS alias — expression is the raw field, function name is "none"
+ functionName = "none";
+ actualSourceField = sourceField.getField().toString();
}
Alternatively, in AstBuilder.buildConversion, always wrap the none-with-alias case in AstDSL.function("none", fieldArg) (matching the other functions), so the anonymizer sees a uniform Function node.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java`
around lines 463 - 495, In visitConvert, handle the case where
conversion.getExpression() is a Field (the "none(field) AS alias" case) so
functionName isn’t left empty: if conversion.getExpression() is a Field, set
functionName = "none", set actualSourceField to that Field's string, and set
fields to MASK_COLUMN (so the resulting format becomes "none(***) AS ***"); keep
the existing logic for Function nodes unchanged and preserve the existing
asClause/targetField logic that compares targetField to actualSourceField.
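The suggested fallback can be sketched stand-alone. Field and Func here are simplified hypothetical stand-ins rather than the real AST node classes, but the branching mirrors the fix above:

```java
// Stand-alone sketch of the anonymizer branch discussed above.
// Field/Func are simplified hypothetical stand-ins for the AST nodes;
// MASK_COLUMN mirrors the PR's masking constant.
final class AnonymizerSketch {
    static final String MASK_COLUMN = "***";

    interface Expr {}
    record Field(String name) implements Expr {}
    record Func(String name, Field arg) implements Expr {}

    static String anonymize(Expr expression, String targetField) {
        String functionName = "";
        String fields = MASK_COLUMN;
        String actualSourceField = "";
        if (expression instanceof Func func) {
            functionName = func.name().toLowerCase(java.util.Locale.ROOT);
            actualSourceField = func.arg().name();
        } else if (expression instanceof Field sourceField) {
            // none(field) AS alias — the expression is the raw field
            functionName = "none";
            actualSourceField = sourceField.name();
        }
        String asClause =
            !targetField.equals(actualSourceField) ? " AS " + MASK_COLUMN : "";
        return functionName + "(" + fields + ")" + asClause;
    }
}
```

With the Field fallback in place, a raw-field expression anonymizes to "none(***) AS ***" instead of "(***) AS ***", while Function expressions are unchanged.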
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
}

public static Object convert(Object value) {
  return new AutoConvertFunction().convertValue(value);
Each static convert() method creates a new UDF instance per row (new AutoConvertFunction().convertValue(value)). Additionally, AutoConvertFunction.applyConversion creates new NumConvertFunction() per row at line 26. For large datasets this produces 2 object allocations per row per convert call. Since convertValue and applyConversion are stateless, use a cached singleton instance instead (e.g. private static final AutoConvertFunction INSTANCE = new AutoConvertFunction()). Same pattern applies to all 5 UDF classes.
Updated to use cached singleton instance instead of creating new instance every row for the 5 UDF classes
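The cached-singleton pattern looks roughly like this. This is a sketch, not the PR's exact code; the class name mirrors the PR but the conversion body is illustrative:

```java
// Sketch of the cached-singleton pattern for a stateless conversion UDF.
// The convertValue body is illustrative, not the PR's actual auto() logic.
final class AutoConvertFunctionSketch {
    // One shared instance; safe because convertValue holds no mutable state.
    private static final AutoConvertFunctionSketch INSTANCE = new AutoConvertFunctionSketch();

    // Called once per row by generated code; no per-row allocation.
    public static Object convert(Object value) {
        return INSTANCE.convertValue(value);
    }

    Object convertValue(Object value) {
        if (value == null) {
            return null;
        }
        String s = value.toString().trim();
        try {
            return Double.parseDouble(s); // numeric strings become numbers
        } catch (NumberFormatException e) {
            return value; // auto-style fallback: leave non-numeric values as-is
        }
    }
}
```

Since the instance is immutable and stateless, sharing it across rows and threads is safe, and the two-allocations-per-row cost disappears.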
Pattern.compile("^([+-]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][+-]?\\d+)?)(.*)$");
private static final Pattern CONTAINS_LETTER_PATTERN = Pattern.compile(".*[a-zA-Z].*");
private static final Pattern STARTS_WITH_SIGN_OR_DIGIT = Pattern.compile("^[+-]?[\\d.].*");
private static final Pattern MEMK_PATTERN = Pattern.compile("^([+-]?\\d+\\.?\\d*)([kmgKMG])?$");
memk("512 m") (with space before unit) returns null. Is this the intended Splunk compatible behavior?
Yes, Splunk also returns null when there is a space between the value and unit
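A self-contained sketch of that behavior, using the MEMK_PATTERN shown above. The kilobyte multipliers are an assumption based on Splunk's documented memk semantics (result expressed in kilobytes, bare numbers treated as kilobytes), not the PR's verified implementation:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of memk parsing with the MEMK_PATTERN from the PR.
// Unit multipliers assume Splunk-style memk semantics (result in KB).
final class MemkSketch {
    private static final Pattern MEMK_PATTERN =
        Pattern.compile("^([+-]?\\d+\\.?\\d*)([kmgKMG])?$");

    static Double memk(String value) {
        Matcher m = MEMK_PATTERN.matcher(value.trim());
        if (!m.matches()) {
            return null; // "512 m" (space before unit) does not match
        }
        double num = Double.parseDouble(m.group(1));
        String unit = m.group(2) == null ? "k" : m.group(2).toLowerCase();
        switch (unit) {
            case "m": return num * 1024;          // megabytes -> kilobytes
            case "g": return num * 1024 * 1024;   // gigabytes -> kilobytes
            default:  return num;                 // "k" or no unit: kilobytes
        }
    }
}
```

Because the regex anchors the optional unit immediately after the number, any whitespace between value and unit falls outside both groups and the whole match fails, yielding null.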
  throw new IllegalArgumentException("Convert function requires a field argument");
}

String functionName = funcCtx.functionName.getText();
convert currently accepts any identifier as a function name and forwards it without semantic validation, so unsupported non-convert functions (for example, convert lower(field)) are accepted even though the command contract documents a fixed set (auto/num/rmcomma/rmunit/memk/none). This makes convert behave like a generic single-arg eval and can silently produce non-numeric output types; function names should be rejected early unless they are in the supported conversion list.
Updated to only accept supported convert functions
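A minimal sketch of the early rejection. The constant name, set contents, and error message are illustrative; the PR's actual code may differ:

```java
import java.util.Set;

// Sketch of rejecting unsupported conversion functions at parse time.
// Names are illustrative, not the PR's exact identifiers.
final class ConvertValidationSketch {
    private static final Set<String> SUPPORTED_CONVERSION_FUNCTIONS =
        Set.of("auto", "num", "rmcomma", "rmunit", "memk", "none");

    // Validates and returns the normalized (lowercase) function name.
    static String validate(String functionName) {
        String normalized = functionName.toLowerCase(java.util.Locale.ROOT);
        if (!SUPPORTED_CONVERSION_FUNCTIONS.contains(normalized)) {
            throw new IllegalArgumentException(
                "Unsupported conversion function '" + functionName
                    + "'. Supported functions are: " + SUPPORTED_CONVERSION_FUNCTIONS);
        }
        return normalized;
    }
}
```

Validating against the fixed set at parse time surfaces a clear error for something like convert lower(field) instead of silently treating it as a generic eval.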
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
♻️ Duplicate comments (5)
core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java (1)

13-18: ⚠️ Potential issue | 🟡 Minor
Add JavaDoc for the public constructor and convert entrypoint.
These public methods still lack JavaDoc with @param, @return, and @throws.
As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java` around lines 13 - 18, add JavaDoc for the public constructor MemkConvertFunction() and the public static entrypoint convert(Object value): document the constructor and the convert method with a brief description, include @param for "value", @return describing the returned Object (and its semantics), and @throws for any exceptions the method may propagate (e.g., IllegalArgumentException or any specific runtime exceptions thrown by INSTANCE.convertValue). Place the JavaDoc immediately above the constructor and the convert method declarations so they comply with the project's "Public methods MUST have JavaDoc" rule.

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (2)

1225-1244: ⚠️ Potential issue | 🟡 Minor
Normalize conversion function names before building the function node.
You validate using toLowerCase() but pass the raw functionName into AstDSL.function(...). If resolution is case-sensitive, CONVERT AUTO(field) could fail or diverge from registered operator naming. Normalize once and use the normalized name in the AST.

🛠️ Suggested fix
- String functionName = funcCtx.functionName.getText();
+ String functionName = funcCtx.functionName.getText();
+ String normalizedFunctionName = functionName.toLowerCase(Locale.ROOT);
...
- if (!SUPPORTED_CONVERSION_FUNCTIONS.contains(functionName.toLowerCase())) {
+ if (!SUPPORTED_CONVERSION_FUNCTIONS.contains(normalizedFunctionName)) {
    throw new IllegalArgumentException(
        String.format(
            "Unsupported conversion function '%s'. Supported functions are: %s",
            functionName, SUPPORTED_CONVERSION_FUNCTIONS));
  }
...
- return new Let(targetField, AstDSL.function(functionName, fieldArg));
+ return new Let(targetField, AstDSL.function(normalizedFunctionName, fieldArg));

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` around lines 1225 - 1244, the code validates functionName using toLowerCase() against SUPPORTED_CONVERSION_FUNCTIONS but still uses the raw functionName when building the AST; normalize the name once (e.g., lowercased normalizedName) after reading funcCtx.functionName.getText(), use normalizedName for the contains check, for the "none" branch comparison, and pass normalizedName into AstDSL.function(...) when constructing the Let node (leave usage of internalVisitExpression and determineTargetField unchanged).

1238-1242: ⚠️ Potential issue | 🟡 Minor
Fix none no-op detection to avoid redundant Let.
The toString() comparison between fieldArg and targetField.getField() is fragile and will never match for Field vs QualifiedName. This makes none(field) always emit new Let(field, field) instead of being a no-op. Use the alias presence to detect the no-op instead.

🛠️ Suggested fix
- if ("none".equalsIgnoreCase(functionName)) {
-   return fieldArg.toString().equals(targetField.getField().toString())
-       ? null
-       : new Let(targetField, fieldArg);
- }
+ if ("none".equalsIgnoreCase(functionName)) {
+   // When no alias is provided, none(field) is a no-op.
+   return funcCtx.alias == null ? null : new Let(targetField, fieldArg);
+ }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` around lines 1238 - 1242, the current "none" branch in AstBuilder uses a fragile toString() comparison between fieldArg and targetField.getField() and thus emits new Let(...) even for no-ops; change the check to detect a true no-op by inspecting fieldArg's alias instead: in the if ("none".equalsIgnoreCase(functionName)) branch, return null when fieldArg has an alias that matches the target field (i.e., detect alias presence on fieldArg and compare it to the targetField.getField() identity/name) and only create new Let(targetField, fieldArg) when that alias-based no-op check fails.

core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java (1)

13-18: ⚠️ Potential issue | 🟡 Minor
Add JavaDoc for the public constructor and convert entrypoint.
These public methods still lack JavaDoc with @param, @return, and @throws.
As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java` around lines 13 - 18, add JavaDoc comments for the public constructor AutoConvertFunction() and the public static entrypoint convert(Object value): document the parameters with @param, the return value with @return, and any possible exceptions thrown with @throws (e.g., if convertValue can throw IllegalArgumentException or other runtime exceptions) following the project's JavaDoc guidelines; place the JavaDoc immediately above the AutoConvertFunction() constructor and above the convert(Object) method and ensure wording describes purpose, parameter semantics, return type, and conditions that lead to exceptions.

core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java (1)

13-18: ⚠️ Potential issue | 🟡 Minor
Add JavaDoc for the public constructor and convert entrypoint.
These public methods still lack JavaDoc with @param, @return, and @throws.
As per coding guidelines: "core/src/main/java/**/*.java: Public methods MUST have JavaDoc with @param, @return, and @throws".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java` around lines 13 - 18, add JavaDoc for the public RmcommaConvertFunction() constructor and the public static convert(Object value) entrypoint: document parameters with @param (describe "value"), describe the return value with @return, and declare any runtime exceptions thrown (or document that none are thrown) with @throws; reference the class RmcommaConvertFunction, the constructor RmcommaConvertFunction(), and the convert(Object) method (which delegates to INSTANCE.convertValue) so future readers know the delegation to INSTANCE.convertValue and any exceptions it may propagate.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- core/src/main/java/org/opensearch/sql/expression/function/udf/AutoConvertFunction.java
- core/src/main/java/org/opensearch/sql/expression/function/udf/MemkConvertFunction.java
- core/src/main/java/org/opensearch/sql/expression/function/udf/NumConvertFunction.java
- core/src/main/java/org/opensearch/sql/expression/function/udf/RmcommaConvertFunction.java
- core/src/main/java/org/opensearch/sql/expression/function/udf/RmunitConvertFunction.java
- docs/user/ppl/cmd/convert.md
- ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
🚧 Files skipped from review as they are similar to previous changes (3)
- core/src/main/java/org/opensearch/sql/expression/function/udf/NumConvertFunction.java
- docs/user/ppl/cmd/convert.md
- core/src/main/java/org/opensearch/sql/expression/function/udf/RmunitConvertFunction.java
penghuo left a comment
Thanks!
Please fix merge conflict.
Signed-off-by: ritvibhatt <53196324+ritvibhatt@users.noreply.github.com>
PR Reviewer Guide 🔍
Here are some key observations to aid the review process:

PR Code Suggestions ✨
Explore these optional code suggestions:
Description
Copy of #5034, resolves merge conflicts
Related Issues
Partially resolves #5001 (only numeric and memory conversions in this PR)
Check List
Commits are signed per the DCO using --signoff or -s.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.