feat: introduce type casting executors for Binary, Date, and Timestam…#101
Open
lszskye wants to merge 1 commit into
Introduce Type Casting Executors for Binary, Date, and Timestamp Types
Summary
Add a set of `CastExecutor` implementations under `src/paimon/core/casting/` to support type casting between binary, date, timestamp, string, and numeric types. These executors extend the casting framework to cover the temporal and binary type conversions required for schema evolution scenarios.

Each executor supports two casting modes:
- Cast a `Literal` value to the target type (used during predicate rewriting).
- Cast an `Array` to the target type (used during batch data reading).

New Classes
- `BinaryToStringCastExecutor`: Casts binary (byte array) values to string type by interpreting the raw bytes as a UTF-8 encoded string.
- `DateToStringCastExecutor`: Casts date values (days since epoch) to their ISO-8601 string representation (e.g., "2024-01-15").
- `DateToTimestampCastExecutor`: Casts date values to timestamp types with a configurable time unit (seconds, milliseconds, microseconds, nanoseconds). Sets the time component to midnight (00:00:00). Note: C++ Paimon supports a narrower date range than Java Paimon when the target unit is nanoseconds, because the value must fit in an int64.
- `NumericPrimitiveToTimestampCastExecutor`: Casts numeric primitive values (interpreted as seconds since epoch) to timestamp types with the specified precision. Note: When the target type has nanosecond precision, the valid numeric range is narrower than in Java Paimon due to int64 overflow constraints.
- `StringToBinaryCastExecutor`: Casts string values to binary type by encoding the string content as raw bytes.
- `StringToDateCastExecutor`: Casts string values in date format (e.g., "2024-01-15") to date type (days since epoch).
- `StringToTimestampCastExecutor`: Casts string values in timestamp format to timestamp types. Supports ISO-8601 formats including "1970-01-01T00:00:00". Note: Unlike Java Paimon, it does not accept numeric values passed as timestamp strings, but it additionally supports the `T`-separated format.
- `TimestampToDateCastExecutor`: Casts timestamp values to date type by truncating the time component and retaining only the date part (days since epoch).
- `TimestampToNumericPrimitiveCastExecutor`: Casts timestamp values to numeric primitive types by extracting the epoch seconds. When the source timestamp has sub-second precision, the fractional part is truncated.
- `TimestampToStringCastExecutor`: Casts timestamp values to their ISO-8601 string representation with appropriate precision (e.g., "2024-01-15 12:30:45.123").
- `TimestampToTimestampCastExecutor`: Converts between `TIMESTAMP` and `TIMESTAMP_WITH_LOCAL_TIME_ZONE` types, handling precision changes (e.g., milliseconds to nanoseconds) by scaling the underlying int64 value accordingly.
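
To make the int64 notes above concrete, here is a minimal sketch of the arithmetic behind a date-to-timestamp cast and a precision change. It is not the code in this PR: `TimeUnit`, `CheckedMul`, `FloorDiv`, `DateToTimestamp`, and `Rescale` are hypothetical names, overflow is reported with `std::nullopt` purely for illustration, and rounding toward negative infinity when reducing precision is an assumption that the description does not spell out.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical time units for illustration; not taken from the PR.
enum class TimeUnit { kSecond, kMilli, kMicro, kNano };

// Number of ticks per second for each unit.
constexpr int64_t TicksPerSecond(TimeUnit unit) {
  switch (unit) {
    case TimeUnit::kSecond: return 1;
    case TimeUnit::kMilli:  return 1'000;
    case TimeUnit::kMicro:  return 1'000'000;
    case TimeUnit::kNano:   return 1'000'000'000;
  }
  return 1;
}

// Multiplies by a positive factor, returning nullopt if the result
// would not fit in int64.
std::optional<int64_t> CheckedMul(int64_t value, int64_t factor) {
  constexpr int64_t kMax = std::numeric_limits<int64_t>::max();
  constexpr int64_t kMin = std::numeric_limits<int64_t>::min();
  if (value > kMax / factor || value < kMin / factor) return std::nullopt;
  return value * factor;
}

// Divides by a positive divisor, rounding toward negative infinity
// (assumed here so that pre-epoch values map to the earlier tick).
int64_t FloorDiv(int64_t value, int64_t divisor) {
  int64_t quotient = value / divisor;
  if (value % divisor != 0 && value < 0) --quotient;
  return quotient;
}

// Days since epoch -> timestamp at midnight in the requested unit.
std::optional<int64_t> DateToTimestamp(int32_t days, TimeUnit unit) {
  // An int32 day count times 86400 always fits in int64.
  int64_t seconds = static_cast<int64_t>(days) * 86'400;
  return CheckedMul(seconds, TicksPerSecond(unit));
}

// Changes the precision of a timestamp by scaling the int64 value.
std::optional<int64_t> Rescale(int64_t value, TimeUnit from, TimeUnit to) {
  int64_t from_ticks = TicksPerSecond(from);
  int64_t to_ticks = TicksPerSecond(to);
  if (to_ticks >= from_ticks) return CheckedMul(value, to_ticks / from_ticks);
  return FloorDiv(value, from_ticks / to_ticks);
}
```

Under these assumptions, day 19737 (2024-01-15) maps to 1705276800000 in milliseconds, while with nanoseconds the largest representable date is day 106751 (in the year 2262); day 106752 already exceeds the int64 range. That bound is the reason the nanosecond range is narrower than for the coarser units.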