Optimize BinaryPlainValuesReader by reading directly from ByteBuffer #3509

@iemejia

Describe the enhancement requested

BinaryPlainValuesReader.readBytes() is the hot-path decoder for BINARY (and STRING) columns using PLAIN encoding. The current implementation funnels every length read through BytesUtils.readIntLittleEndian(InputStream) and every value slice through ByteBufferInputStream.slice(int):

public Binary readBytes() {
  try {
    int length = BytesUtils.readIntLittleEndian(in);
    return Binary.fromConstantByteBuffer(in.slice(length));
  } catch (IOException | RuntimeException e) {
    throw new ParquetDecodingException("could not read bytes at offset " + in.position(), e);
  }
}

Two issues per value:

  1. BytesUtils.readIntLittleEndian(InputStream) calls in.read() four times. Each call goes through a try / IOException plumbing path and a virtual dispatch on ByteBufferInputStream (typically resolved as either SingleBufferInputStream or MultiBufferInputStream).
  2. in.slice(length) is also a virtual dispatch on ByteBufferInputStream for every value.

If the page is materialised as a MultiBufferInputStream the cost is even higher because each slice may have to walk a buffer list.
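To make the per-value cost concrete, here is a minimal sketch contrasting the two ways of reading a 4-byte little-endian length. The class and method names are hypothetical, and `readIntFromStream` is only a stand-in for the shape of a four-`read()` stream decoder, not a copy of the Parquet source.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Parquet.
final class LittleEndianReadSketch {

  // Stream-style read: four separate read() calls, each of which can throw IOException.
  static int readIntFromStream(InputStream in) throws IOException {
    int b0 = in.read();
    int b1 = in.read();
    int b2 = in.read();
    int b3 = in.read();
    if ((b0 | b1 | b2 | b3) < 0) {
      throw new EOFException(); // read() returned -1 somewhere
    }
    return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
  }

  // Buffer-style read: a single getInt() on a buffer whose order is LITTLE_ENDIAN.
  static int readIntFromBuffer(ByteBuffer buf) {
    return buf.getInt();
  }

  public static void main(String[] args) throws IOException {
    byte[] encoded = {0x05, 0x00, 0x00, 0x00}; // the value 5 as a little-endian int
    int viaStream = readIntFromStream(new ByteArrayInputStream(encoded));
    int viaBuffer = readIntFromBuffer(ByteBuffer.wrap(encoded).order(ByteOrder.LITTLE_ENDIAN));
    System.out.println(viaStream + " " + viaBuffer);
  }
}
```

Both paths decode the same bytes to the same value; the difference is only in how many calls and exception paths sit between the caller and the bytes.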

JMH (BinaryEncodingBenchmark.decodePlain, 100k values per invocation, JDK 18, -wi 5 -i 10 -f 3, 30 samples) on master:

| cardinality | stringLength | ops/s |
|---|---|---|
| HIGH | 10 | 23.11M |
| HIGH | 100 | 20.52M |
| HIGH | 1000 | 7.07M |
| LOW | 10 | 22.89M |
| LOW | 100 | 20.35M |
| LOW | 1000 | 6.28M |

Proposal

Replace the ByteBufferInputStream field with a single ByteBuffer set up once in initFromPage:

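@Override
public void initFromPage(int valueCount, ByteBufferInputStream stream) throws IOException {
  // Take the rest of the page as one ByteBuffer, set to little-endian for getInt().
  int available = stream.available();
  this.buffer = available > 0
      ? stream.slice(available).order(ByteOrder.LITTLE_ENDIAN)
      : ByteBuffer.allocate(0).order(ByteOrder.LITTLE_ENDIAN);
}

@Override
public Binary readBytes() {
  // 4-byte little-endian length prefix.
  int length = buffer.getInt();
  // Zero-copy view of the next `length` bytes.
  ByteBuffer valueSlice = buffer.slice();
  valueSlice.limit(length);
  // Advance past the value so the next call starts at the next length prefix.
  buffer.position(buffer.position() + length);
  // Note: as written, a corrupt length surfaces as an unchecked buffer exception,
  // not as the ParquetDecodingException thrown by the current implementation.
  return Binary.fromConstantByteBuffer(valueSlice);
}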

The length prefix is now a single ByteBuffer.getInt() (one bounds check, no IOException plumbing, JIT-friendly intrinsic on little-endian buffers) and each value slice is a direct ByteBuffer.slice() instead of a virtual ByteBufferInputStream.slice(int).
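The read pattern above can be tried outside Parquet with plain `java.nio`. The following is a sketch under stated assumptions: the class name, `encode`, `readValue`, and `decodeAll` are hypothetical helpers, and a raw `ByteBuffer` slice stands in for the returned `Binary`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Parquet.
final class PlainBinaryDecodeSketch {

  // Builds a page in the PLAIN BINARY layout: 4-byte little-endian length, then the bytes.
  static ByteBuffer encode(String... values) {
    int size = 0;
    for (String v : values) {
      size += 4 + v.getBytes(StandardCharsets.UTF_8).length;
    }
    ByteBuffer page = ByteBuffer.allocate(size).order(ByteOrder.LITTLE_ENDIAN);
    for (String v : values) {
      byte[] bytes = v.getBytes(StandardCharsets.UTF_8);
      page.putInt(bytes.length);
      page.put(bytes);
    }
    page.flip(); // switch from writing to reading; byte order is preserved
    return page;
  }

  // Same four steps as the proposed readBytes(): getInt, slice, limit, advance.
  static ByteBuffer readValue(ByteBuffer page) {
    int length = page.getInt();
    ByteBuffer valueSlice = page.slice();
    valueSlice.limit(length);
    page.position(page.position() + length);
    return valueSlice;
  }

  // Decodes every value on the page into a String, for inspection only.
  static List<String> decodeAll(ByteBuffer page) {
    List<String> out = new ArrayList<>();
    while (page.hasRemaining()) {
      out.add(StandardCharsets.UTF_8.decode(readValue(page)).toString());
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(decodeAll(encode("parquet", "", "plain")));
  }
}
```

Each returned slice shares the page's backing memory, so no value bytes are copied during decoding.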

The trade-off: when the input is a MultiBufferInputStream the upfront stream.slice(available) call may consolidate the page into a single fresh ByteBuffer. This costs one allocation and one copy of the page bytes per page, in exchange for inlined per-value reads, which is a clear win whenever the page contains more than a handful of values.
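As an illustration of what consolidation means here, the sketch below copies a page that is split across several buffers into one contiguous buffer. `ConsolidateSketch` and `consolidate` are hypothetical names; this is not the MultiBufferInputStream implementation, only the general idea under the assumption that the page arrives as a list of ByteBuffers.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Parquet.
final class ConsolidateSketch {

  // Copies all parts into one contiguous little-endian buffer: one allocation, one pass.
  static ByteBuffer consolidate(List<ByteBuffer> parts) {
    int total = 0;
    for (ByteBuffer part : parts) {
      total += part.remaining();
    }
    ByteBuffer merged = ByteBuffer.allocate(total);
    for (ByteBuffer part : parts) {
      merged.put(part.duplicate()); // duplicate() leaves the caller's position untouched
    }
    merged.flip();
    return merged.order(ByteOrder.LITTLE_ENDIAN);
  }

  public static void main(String[] args) {
    // A length prefix (3) that straddles two buffers, followed by the value "abc".
    ByteBuffer first = ByteBuffer.wrap(new byte[] {0x03, 0x00});
    ByteBuffer second = ByteBuffer.wrap(new byte[] {0x00, 0x00, 'a', 'b', 'c'});
    ByteBuffer merged = consolidate(Arrays.asList(first, second));
    System.out.println(merged.getInt() + " " + merged.remaining());
  }
}
```

Once the page is contiguous, a length prefix can no longer straddle a buffer boundary, which is what lets the per-value read collapse to a plain `getInt()`.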

Expected speedup (same JMH config):

| cardinality | stringLength | Before (ops/s) | After (ops/s) | Δ |
|---|---|---|---|---|
| HIGH | 10 | 23.11M | 27.13M | +17.4% (1.17x) |
| HIGH | 100 | 20.52M | 22.20M | +8.2% (1.08x) |
| HIGH | 1000 | 7.07M | 7.68M | +8.6% (1.09x) |
| LOW | 10 | 22.89M | 26.46M | +15.6% (1.16x) |
| LOW | 100 | 20.35M | 22.16M | +8.9% (1.09x) |
| LOW | 1000 | 6.28M | 7.50M | +19.4% (1.19x) |
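For reference, the Δ column follows from the Before and After throughputs as after / before for the ratio and (after / before - 1) × 100 for the percentage. A small sketch with hypothetical helper names, using two rows of the table as inputs:

```java
import java.util.Locale;

// Hypothetical illustration class; not part of Parquet or the benchmark.
final class SpeedupSketch {

  // Throughput ratio, e.g. 1.17 for a 17% gain.
  static double ratio(double before, double after) {
    return after / before;
  }

  // Relative gain in percent.
  static double percentGain(double before, double after) {
    return (ratio(before, after) - 1.0) * 100.0;
  }

  public static void main(String[] args) {
    // HIGH / 10 and LOW / 1000 rows, in millions of ops/s.
    double[][] rows = {{23.11, 27.13}, {6.28, 7.50}};
    for (double[] row : rows) {
      System.out.printf(Locale.ROOT, "%+.1f%% (%.2fx)%n",
          percentGain(row[0], row[1]), ratio(row[0], row[1]));
    }
  }
}
```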

Allocation per op is unchanged (~88 B/op = the returned Binary + the per-value ByteBuffer slice).

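The improvement at 10-byte strings (+16 to +17%) is about twice that at 100-byte strings (+8 to +9%) because the per-value fixed cost (length read + slice) dominates more at small lengths. At 1000-byte values the cost is increasingly dominated by the value-bytes copy/compare downstream rather than the read itself, but the gain is still +8.6% (HIGH) and +19.4% (LOW) there; the LOW/1000 result is the largest in the table and is not explained by the fixed-cost argument alone.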

Scope

  • Single file change to parquet-column/src/main/java/org/apache/parquet/column/values/plain/BinaryPlainValuesReader.java.
  • No public-API change; only the implementation of readBytes(), skip(), and initFromPage() is rewritten.
  • All 573 parquet-column tests pass.
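The issue text does not show the rewritten skip(). One plausible shape, given the same single-buffer layout, is to read the length prefix and advance the position without creating a slice. The sketch below is an assumption about how that could look, not the code in the PR; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Parquet.
final class PlainBinarySkipSketch {

  // Skips one value: read its length prefix, then jump over its bytes.
  static void skipOne(ByteBuffer page) {
    int length = page.getInt();
    page.position(page.position() + length);
  }

  // Skips n consecutive values.
  static void skip(ByteBuffer page, int n) {
    for (int i = 0; i < n; i++) {
      skipOne(page);
    }
  }

  public static void main(String[] args) {
    ByteBuffer page = ByteBuffer.allocate(32).order(ByteOrder.LITTLE_ENDIAN);
    page.putInt(2).put(new byte[] {'a', 'b'});
    page.putInt(3).put(new byte[] {'c', 'd', 'e'});
    page.putInt(1).put(new byte[] {'f'});
    page.flip();

    skip(page, 2); // jump over "ab" and "cde"
    System.out.println(page.getInt() + " " + (char) page.get());
  }
}
```

Skipping this way allocates nothing per value, since no slice or Binary is created for the values that are passed over.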

Relation

Part of a small series of focused performance PRs from work in parquet-perf. Previous: #3494 (PlainValuesReader), #3496 (PlainValuesWriter), #3500 (Binary.hashCode cache), #3504 (BSS writer), #3506 (BSS reader).
