feat: introduce type casting executors for schema evolution #100
Open
lszskye wants to merge 1 commit into
Introduce Type Casting Executors for Schema Evolution
Summary
Add a set of `CastExecutor` implementations under `src/paimon/core/casting/` to support type casting between boolean, numeric, decimal, and string types. These executors are essential for schema evolution scenarios where column types change across table versions, enabling predicate pushdown and data reading to work correctly across different schemas.

Each executor supports two casting modes:

- Cast a `Literal` value to the target type (used during predicate rewriting).
- Cast an `Array` to the target type (used during batch data reading).

New Classes

- `BooleanToDecimalCastExecutor`: Casts boolean values to decimal types. Converts `true` to `Decimal128(1)` and `false` to `Decimal128(0)` with the specified target precision and scale.
- `BooleanToNumericCastExecutor`: Casts boolean values to numeric primitive types (int8, int16, int32, int64, float, double). Uses a dispatch map to route casting logic per target `FieldType`, mapping `true` to `1` and `false` to `0`.
- `BooleanToStringCastExecutor`: Casts boolean values to string type. Converts `true` to `"true"` and `false` to `"false"`.
- `DecimalToDecimalCastExecutor`: Casts decimal values between decimal types of different precision/scale. Rescales the underlying `Decimal128` representation to match the target precision and scale, with overflow detection.
- `DecimalToNumericPrimitiveCastExecutor`: Casts decimal values to numeric primitive types. Truncates the decimal value to the target integer or floating-point type by dividing out the scale factor.
- `NumericPrimitiveToDecimalCastExecutor`: Casts numeric primitive values (int8 through double) to decimal types. Uses templated dispatch to handle each source type, scaling the input value to the target decimal precision and scale.
- `NumericToBooleanCastExecutor`: Casts numeric values to boolean type. Uses a dispatch map per source `FieldType`, converting zero values to `false` and non-zero values to `true`.
- `NumericToStringCastExecutor`: Casts numeric values to string type. Uses a dispatch map per source `FieldType`, converting each numeric value to its string representation.
- `StringToBooleanCastExecutor`: Casts string values to boolean type. Recognizes `"true"`/`"TRUE"` as `true` and `"false"`/`"FALSE"` as `false`, returning an error for unrecognized string values.
- `StringToDecimalCastExecutor`: Casts string values to decimal types. Parses the string into an `arrow::Decimal128` value with the specified target precision and scale, with support for null propagation when parsing fails.
- `StringToNumericPrimitiveCastExecutor`: Casts string values to numeric primitive types (int8 through double). Uses a templated dispatch map per target `FieldType`, parsing the string representation into the corresponding numeric value.
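
For reviewers who want a concrete picture of the dispatch-map approach described for `BooleanToNumericCastExecutor`, here is a minimal standalone sketch. It is not the code in this PR: `TargetType`, `NumericValue`, `MakeBoolCast`, `BoolCastTable`, and `CastBool` are hypothetical names, a `std::variant` stands in for the `Literal` type, and `std::nullopt` stands in for an error status on unsupported targets.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <unordered_map>
#include <utility>
#include <variant>

// Hypothetical stand-in for the project's FieldType enum.
enum class TargetType { kInt8, kInt16, kInt32, kInt64, kFloat, kDouble, kString };

// Hypothetical stand-in for a typed literal value.
using NumericValue = std::variant<int8_t, int16_t, int32_t, int64_t, float, double>;
using BoolCastFn = std::function<NumericValue(bool)>;

// Builds a cast function that maps true -> 1 and false -> 0 in type T.
template <typename T>
BoolCastFn MakeBoolCast() {
    return [](bool v) -> NumericValue {
        return NumericValue(std::in_place_type<T>, static_cast<T>(v ? 1 : 0));
    };
}

// One table entry per supported target type; built once on first use.
inline const std::unordered_map<TargetType, BoolCastFn>& BoolCastTable() {
    static const std::unordered_map<TargetType, BoolCastFn> table = {
        {TargetType::kInt8, MakeBoolCast<int8_t>()},
        {TargetType::kInt16, MakeBoolCast<int16_t>()},
        {TargetType::kInt32, MakeBoolCast<int32_t>()},
        {TargetType::kInt64, MakeBoolCast<int64_t>()},
        {TargetType::kFloat, MakeBoolCast<float>()},
        {TargetType::kDouble, MakeBoolCast<double>()},
    };
    return table;
}

// Looks up the target type and applies the cast; returns nullopt when the
// target type has no entry (e.g. kString in this sketch).
inline std::optional<NumericValue> CastBool(bool value, TargetType target) {
    const auto& table = BoolCastTable();
    auto it = table.find(target);
    if (it == table.end()) {
        return std::nullopt;
    }
    return it->second(value);
}
```

The appeal of this shape is that adding a target type is a one-line table entry, and an unsupported type surfaces as a lookup miss rather than a missing `switch` case.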
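
The rescaling with overflow detection mentioned for `DecimalToDecimalCastExecutor` can be illustrated with plain integers. The sketch below is an assumption-laden simplification, not the PR's implementation: it uses an `int64_t` unscaled value instead of `Decimal128` (so it only covers precisions up to 18), it truncates toward zero when reducing scale (the actual executor may round differently), and `PowerOfTen` and `RescaleDecimal` are hypothetical helper names.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns 10^exp for 0 <= exp <= 18.
inline int64_t PowerOfTen(int exp) {
    int64_t result = 1;
    for (int i = 0; i < exp; ++i) {
        result *= 10;
    }
    return result;
}

// Rescales `unscaled` (a decimal stored as an integer with `from_scale`
// fractional digits) to `to_scale`, then checks that the result fits in
// `to_precision` digits. Returns nullopt on overflow.
// Assumes 0 <= to_precision <= 18 and |to_scale - from_scale| <= 18.
inline std::optional<int64_t> RescaleDecimal(int64_t unscaled, int from_scale,
                                             int to_scale, int to_precision) {
    int64_t result = unscaled;
    if (to_scale > from_scale) {
        const int64_t factor = PowerOfTen(to_scale - from_scale);
        // Reject inputs whose product would not fit in int64_t.
        if (unscaled > std::numeric_limits<int64_t>::max() / factor ||
            unscaled < std::numeric_limits<int64_t>::min() / factor) {
            return std::nullopt;
        }
        result = unscaled * factor;
    } else if (to_scale < from_scale) {
        // Integer division truncates toward zero (an assumption of this sketch).
        result = unscaled / PowerOfTen(from_scale - to_scale);
    }
    // The value must have at most `to_precision` decimal digits.
    const int64_t limit = PowerOfTen(to_precision);
    if (result >= limit || result <= -limit) {
        return std::nullopt;
    }
    return result;
}
```

For example, `123.45` stored as `12345` at scale 2 becomes `1234500` at scale 4, while casting the same value to a precision of 4 at scale 2 is reported as overflow because it needs five digits.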
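
The string-to-boolean rule described for `StringToBooleanCastExecutor` is small enough to show in full. This is a sketch under stated assumptions: `ParseBoolean` is a hypothetical name, only the four spellings listed in the description are accepted, and `std::nullopt` stands in for the error the executor returns on unrecognized input.

```cpp
#include <optional>
#include <string_view>

// Accepts exactly "true"/"TRUE" and "false"/"FALSE"; anything else
// (including mixed case such as "True") yields nullopt in this sketch.
inline std::optional<bool> ParseBoolean(std::string_view text) {
    if (text == "true" || text == "TRUE") {
        return true;
    }
    if (text == "false" || text == "FALSE") {
        return false;
    }
    return std::nullopt;
}
```

Whether mixed-case or numeric spellings such as `"1"` should also be accepted is a compatibility question worth confirming against the reference behavior before merging.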