Skip to content

Conversation

@XuQianJin-Stars
Copy link
Contributor

@XuQianJin-Stars XuQianJin-Stars commented Jan 5, 2026

Purpose

Linked issue: close #2257

This PR adds support for MAP data types when tiering data from Fluss to Apache Paimon. Previously, MAP types were not properly handled during the tiering process, which limited the ability to tier tables containing MAP columns to Paimon.

Brief change log

  • Added FlussMapAsPaimonMap adapter class to convert Fluss InternalMap to Paimon InternalMap format
  • Added PaimonMapAsFlussMap adapter class to wrap Paimon InternalMap as Fluss InternalMap for reading
  • Enhanced FlussRowAsPaimonRow to support getMap() method for handling MAP type fields by wrapping Fluss maps with proper key and value type information
  • Enhanced FlussArrayAsPaimonArray to support nested MAP types within arrays
  • Enhanced PaimonRowAsFlussRow and PaimonArrayAsFlussArray to support bidirectional MAP type conversion
  • Updated documentation to include MAP<kt, vt> type mapping in the Paimon data type mapping table

Tests

  • FlussRecordAsPaimonRowTest.testMapTypeWithIntegerKeyValue: Verifies basic MAP type with integer keys and values
  • FlussRecordAsPaimonRowTest.testMapTypeWithStringKeyIntValue: Verifies MAP type with string keys and integer values
  • FlussRecordAsPaimonRowTest.testNestedMapType: Verifies nested MAP types (MAP containing MAP as values)
  • FlussRecordAsPaimonRowTest.testMapWithComplexTypes: Verifies MAP type with complex value types like DECIMAL
  • FlussRowAsPaimonRowTest: Contains additional test cases for MAP type handling in row conversion

API and Format

This change does not affect API or storage format. It enables an existing data type (MAP) to be properly handled during the tiering process to Paimon.

Documentation

The documentation has been updated to include MAP<kt, vt> in the Paimon data type mapping table.

Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XuQianJin-Stars Thanks for the pr. Please also add a IT to make sure
tiering and union read works for map type just like we did for array & nest row type

}

@Test
void testMapTypeWithIntegerKeyValue() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these test method be put into one single test method?

}

@Test
void testMapTypeWithIntegerKeyValue() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dito
Can these test methods put into one single test method

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for MAP data types when tiering data from Fluss to Apache Paimon. Previously, MAP types threw UnsupportedOperationException, limiting the ability to tier tables with MAP columns. The implementation follows the existing pattern used for ARRAY types and includes bidirectional conversion support.

Key Changes:

  • Added FlussMapAsPaimonMap adapter class to convert Fluss InternalMap to Paimon InternalMap format
  • Enhanced FlussRowAsPaimonRow and FlussArrayAsPaimonArray to support MAP type conversion with proper type information propagation
  • Added comprehensive test coverage including nested maps, null handling, empty maps, and maps within arrays

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/source/FlussMapAsPaimonMap.java New adapter class that implements Paimon's InternalMap interface by wrapping Fluss InternalMap and converting key/value arrays with proper type information
fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/source/FlussRowAsPaimonRow.java Replaced UnsupportedOperationException with MAP type conversion logic that extracts type information and creates FlussMapAsPaimonMap adapter
fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/source/FlussArrayAsPaimonArray.java Added MAP type support for arrays containing maps, enabling nested MAP types within ARRAY columns
fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/tiering/FlussRecordAsPaimonRowTest.java Added 9 comprehensive test cases covering integer/string keys, nested maps, null handling, empty maps, nullable values, maps in arrays, and complex value types
fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/source/FlussRowAsPaimonRowTest.java Added 4 test cases for row-level MAP type conversion including nested maps and maps within arrays

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 831 to 865
void testMapTypeWithIntegerKeyValue() {
int tableBucket = 0;
RowType tableRowType =
RowType.of(
new org.apache.paimon.types.MapType(
new org.apache.paimon.types.IntType(),
new org.apache.paimon.types.IntType()),
// system columns
new org.apache.paimon.types.IntType(),
new org.apache.paimon.types.BigIntType(),
new org.apache.paimon.types.LocalZonedTimestampType(3));

FlussRecordAsPaimonRow flussRecordAsPaimonRow =
new FlussRecordAsPaimonRow(tableBucket, tableRowType);
long logOffset = 0;
long timeStamp = System.currentTimeMillis();
GenericRow genericRow = new GenericRow(1);
Map<Object, Object> mapData = new HashMap<>();
mapData.put(1, 100);
mapData.put(2, 200);
mapData.put(3, 300);
genericRow.setField(0, new GenericMap(mapData));

LogRecord logRecord = new GenericRecord(logOffset, timeStamp, APPEND_ONLY, genericRow);
flussRecordAsPaimonRow.setFlussRecord(logRecord);

InternalMap map = flussRecordAsPaimonRow.getMap(0);
assertThat(map).isNotNull();
assertThat(map.size()).isEqualTo(3);

InternalArray keys = map.keyArray();
InternalArray values = map.valueArray();
assertThat(keys.size()).isEqualTo(3);
assertThat(values.size()).isEqualTo(3);
}
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test methods testMapTypeWithIntegerKeyValue, testMapTypeWithStringKeyIntValue, testNestedMapType, testMapWithComplexTypes, testMapInArray cover the basic functionality but don't verify the actual key and value contents retrieved from the InternalMap. The tests only assert the size of keys and values arrays but don't check if the actual data is correctly converted. Consider adding assertions to verify the actual key-value pairs are correctly accessible through the Paimon InternalMap interface, for example by checking specific key and value retrievals like keys.getInt(0), values.getInt(0), etc.

Copilot uses AI. Check for mistakes.
Comment on lines 342 to 370
void testMapTypeWithIntegerKeyValue() {
RowType tableRowType =
RowType.of(
new org.apache.paimon.types.MapType(
new org.apache.paimon.types.IntType(),
new org.apache.paimon.types.IntType()));

long logOffset = 0;
long timeStamp = System.currentTimeMillis();
GenericRow genericRow = new GenericRow(1);
Map<Object, Object> mapData = new HashMap<>();
mapData.put(1, 100);
mapData.put(2, 200);
mapData.put(3, 300);
genericRow.setField(0, new GenericMap(mapData));

LogRecord logRecord = new GenericRecord(logOffset, timeStamp, APPEND_ONLY, genericRow);
FlussRowAsPaimonRow flussRowAsPaimonRow =
new FlussRowAsPaimonRow(logRecord.getRow(), tableRowType);

InternalMap map = flussRowAsPaimonRow.getMap(0);
assertThat(map).isNotNull();
assertThat(map.size()).isEqualTo(3);

InternalArray keys = map.keyArray();
InternalArray values = map.valueArray();
assertThat(keys.size()).isEqualTo(3);
assertThat(values.size()).isEqualTo(3);
}
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the tiering tests, these test methods only verify the size of the maps but don't assert the actual content of the keys and values. For example, in testMapTypeWithIntegerKeyValue, after getting the keys and values arrays, the test should verify that the arrays contain the expected data (e.g., keys contain 1, 2, 3 and values contain 100, 200, 300). This would ensure the data conversion is working correctly end-to-end.

Copilot uses AI. Check for mistakes.
}

@Test
void testNestedMapType() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this test method be moved to testMapWithAllTypes?

}

@Test
void testMapInArray() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this test method be removed since testMapWithAllTypes already verify this case?

}

@Test
void testNestedMapType() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to move to testMapWithAllTypes?

@XuQianJin-Stars
Copy link
Contributor Author

@luoyuxia Hi, i already updated the pr. Please help review when you got some time.

Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XuQianJin-Stars Thanks. LGTM!

@luoyuxia luoyuxia merged commit 835365d into apache:main Jan 7, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support tier Map type for paimon

2 participants