Core: Implement LZ4 frame compression for Puffin format #16054

Draft
laserninja wants to merge 1 commit into apache:main from laserninja:fix/16033-puffin-lz4-compression

@laserninja

What

Implements LZ4 frame compression and decompression in PuffinFormat, fixing the UnsupportedOperationException thrown whenever LZ4 compression is requested. LZ4 is the default codec for Puffin footer compression.

Why

The compress() and decompress() methods in PuffinFormat had TODO stubs for the LZ4 case that fell through to throw new UnsupportedOperationException("Unsupported codec: lz4"). As a result, footer compression, which defaults to LZ4 per the Puffin spec, was unusable at runtime.

How

  • Added lz4-java (net.jpountz.lz4) as a dependency to iceberg-core. The library was already defined in the version catalog (gradle/libs.versions.toml) but was not wired as a dependency.
  • Implemented compressLz4() using LZ4FrameOutputStream with CONTENT_SIZE and BLOCK_INDEPENDENCE flags, conforming to the Puffin spec requirement of "LZ4 single compression frame with content size present".
  • Implemented decompressLz4() using LZ4FrameInputStream.
  • The existing aircompressor library only provides block-level LZ4, not the frame-level compression required by the Puffin spec. lz4-java provides the necessary frame format support.
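
For reviewers who want to sanity-check the "content size present" requirement without pulling in lz4-java, here is a minimal JDK-only sketch that reads the start of an LZ4 frame and returns the declared content size. It is not code from this PR: the class and method names (Lz4FrameHeaderCheck, declaredContentSize) are hypothetical, and the byte layout is my reading of the LZ4 frame format (little-endian magic number 0x184D2204, then FLG, BD, and an 8-byte content size when FLG bit 3 is set). It does not verify the header checksum or decode any blocks.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.OptionalLong;

/** Hypothetical helper for illustration; not part of Iceberg or lz4-java. */
final class Lz4FrameHeaderCheck {
  // LZ4 frame magic number, stored little-endian on the wire.
  static final int MAGIC = 0x184D2204;
  // FLG bit 3: the frame descriptor carries an 8-byte content size.
  static final int FLG_CONTENT_SIZE = 0x08;
  // Minimum bytes needed: magic (4) + FLG (1) + BD (1) + content size (8).
  static final int MIN_HEADER_BYTES = 14;

  private Lz4FrameHeaderCheck() {}

  /**
   * Returns the content size declared in the frame header, or empty if the
   * input is not an LZ4 frame (version 01) with the content size flag set.
   */
  static OptionalLong declaredContentSize(byte[] data) {
    if (data.length < MIN_HEADER_BYTES) {
      return OptionalLong.empty();
    }
    ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
    if (buf.getInt() != MAGIC) {
      return OptionalLong.empty();
    }
    int flg = buf.get() & 0xFF;
    // The two highest FLG bits hold the format version, which must be 01.
    if ((flg >>> 6) != 1 || (flg & FLG_CONTENT_SIZE) == 0) {
      return OptionalLong.empty();
    }
    buf.get(); // skip the BD byte (block maximum size)
    return OptionalLong.of(buf.getLong());
  }

  public static void main(String[] args) {
    // Hand-built header: FLG 0x68 = version 01 + block independence + content size.
    byte[] header = ByteBuffer.allocate(MIN_HEADER_BYTES)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putInt(MAGIC)
        .put((byte) 0x68)
        .put((byte) 0x40) // BD: 64 KB block maximum size
        .putLong(1234L)
        .array();
    System.out.println(declaredContentSize(header));
  }
}
```

A check like this could be pointed at the output of compressLz4() in a test to confirm that the frame advertises the uncompressed length, assuming the compressed bytes are available as a byte array.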

Testing

  • Added round-trip tests in TestPuffinFormat for both non-empty and empty data.
  • Updated testEmptyFooterCompressed in TestPuffinWriter: it previously asserted UnsupportedOperationException and now verifies successful LZ4 footer compression and read-back.
  • Added testWriteAndReadMetricDataCompressedLz4 in TestPuffinWriter for full write + read verification with LZ4-compressed blobs.

All existing Puffin tests continue to pass.

Fixes #16033

Implement LZ4 frame compression and decompression in PuffinFormat
using lz4-java (net.jpountz.lz4), which was already defined in the
version catalog but not wired as a dependency of iceberg-core.

The Puffin spec requires LZ4 single compression frame with content
size present. The existing aircompressor library only provides
block-level LZ4, not frame-level, so lz4-java's LZ4FrameOutputStream
and LZ4FrameInputStream are used instead.

Previously, compress() and decompress() had TODO stubs for LZ4 that
threw UnsupportedOperationException at runtime, making footer
compression (which defaults to LZ4) unusable.

Fixes apache#16033
@laserninja laserninja marked this pull request as draft April 20, 2026 05:30
@nastra nastra self-requested a review April 21, 2026 15:51
Development

Successfully merging this pull request may close these issues.

Core: Puffin LZ4 footer compression throws UnsupportedOperationException at runtime