Skip to content

[WIP] OPEN_STRUCT storage layer — columnar two-tier dense/sparse index (PR 2/4)#18643

Draft
tarun11Mavani wants to merge 12 commits into
apache:masterfrom
tarun11Mavani:open-struct-storage
Draft

[WIP] OPEN_STRUCT storage layer — columnar two-tier dense/sparse index (PR 2/4)#18643
tarun11Mavani wants to merge 12 commits into
apache:masterfrom
tarun11Mavani:open-struct-storage

Conversation

@tarun11Mavani
Copy link
Copy Markdown
Contributor

Instructions:

  1. The PR has to be tagged with at least one of the following labels (*):
    1. feature
    2. bugfix
    3. performance
    4. ui
    5. backward-incompat
    6. release-notes (**)
  2. Remove these instructions before publishing the PR.

(*) Other labels to consider:

  • testing
  • dependencies
  • docker
  • kubernetes
  • observability
  • security
  • code-style
  • extension-point
  • refactor
  • cleanup

(**) Use release-notes label for scenarios like:

  • New configuration options
  • Deprecation of configurations
  • Signature changes to public methods/interfaces
  • New plugins added or old plugins removed

@tarun11Mavani tarun11Mavani force-pushed the open-struct-storage branch from 7f254e4 to d0e436f Compare June 1, 2026 05:26
_dataSources =
new Object2ObjectOpenHashMap<>(segmentMetadata.getColumnMetadataMap().size());

Map<String, Map<String, DataSource>> openStructDenseChildren = new HashMap<>();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Map.of() for all

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jun 1, 2026

Codecov Report

❌ Patch coverage is 67.98732% with 202 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.44%. Comparing base (30b3b31) to head (8fe4042).
⚠️ Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
...ator/impl/openstruct/OpenStructColumnSplitter.java 84.10% 26 Missing and 15 partials ⚠️
...gment/index/openstruct/MutableOpenStructIndex.java 67.39% 21 Missing and 9 partials ⚠️
...local/indexsegment/mutable/MutableSegmentImpl.java 13.04% 15 Missing and 5 partials ⚠️
...l/indexsegment/immutable/ImmutableSegmentImpl.java 25.00% 16 Missing and 2 partials ⚠️
.../index/openstruct/MutableOpenStructDataSource.java 57.50% 15 Missing and 2 partials ⚠️
...cal/segment/index/openstruct/MutableKeyColumn.java 60.00% 16 Missing ⚠️
.../segment/index/openstruct/OpenStructIndexType.java 48.00% 11 Missing and 2 partials ⚠️
...ndex/openstruct/ImmutableOpenStructDataSource.java 71.87% 9 Missing ⚠️
...he/pinot/segment/local/utils/TableConfigUtils.java 11.11% 6 Missing and 2 partials ⚠️
...local/segment/creator/impl/BaseSegmentCreator.java 40.00% 5 Missing and 1 partial ⚠️
... and 9 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18643      +/-   ##
============================================
+ Coverage     64.41%   64.44%   +0.02%     
- Complexity     1282     1291       +9     
============================================
  Files          3362     3373      +11     
  Lines        207907   208683     +776     
  Branches      32463    32588     +125     
============================================
+ Hits         133923   134476     +553     
- Misses        63221    63393     +172     
- Partials      10763    10814      +51     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (?)
java-21 64.44% <67.98%> (+0.02%) ⬆️
temurin 64.44% <67.98%> (+0.02%) ⬆️
unittests 64.43% <67.98%> (+0.02%) ⬆️
unittests1 56.62% <10.45%> (-0.19%) ⬇️
unittests2 37.26% <64.81%> (+0.10%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…x (PR 2/4)

Storage layer for the OPEN_STRUCT column type: a self-describing column
whose keys are discovered at ingest time and stored columnar in two tiers.

Mutable (consuming) path:
- MutableOpenStructIndex / MutableKeyColumn: per-key dictionary-encoded
  forward index + presence bitmap, with 3-level type resolution (declared
  child FieldSpec, else value-based inference) and maxDenseKeys capping.
- MutableOpenStructDataSource exposes per-key DataSources to the query layer
  via the OpenStructDataSource SPI; implemented as a MutableIndex.

Seal / offline path:
- OpenStructColumnSplitter classifies keys into dense vs sparse by fill rate
  (plus explicit denseKeys and maxDenseKeys), writes each dense key as a
  standard materialized column (col$key) with optional dictionary/inverted/
  null-vector, and packs the rest into a single sparse JSON column
  (col$__sparse__). Emits per-child and parent column metadata.
- BaseSegmentCreator merges the splitter's per-child metadata into the
  segment properties so each child loads as its own column.

Immutable (sealed) path:
- ImmutableSegmentImpl groups materialized children (parentColumn metadata)
  under an ImmutableOpenStructDataSource in a single pass over column metadata.
- ImmutableSegmentLoader keeps materialized children that are absent from the
  user-facing schema.

Wiring / validation:
- OpenStructIndexType + OpenStructIndexPlugin register the index; reader
  factory is a no-op (children load as standard columns).
- TableConfigUtils rejects user columns containing the reserved '$' separator
  when any field is OPEN_STRUCT.
- RealtimeSegmentStatsContainer returns EmptyColumnStatistics for the parent.
- V1Constants: PARENT_COLUMN, HAS_SPARSE_COLUMN; OpenStructNaming helpers
  including shared value->DataType inference.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tarun11Mavani tarun11Mavani force-pushed the open-struct-storage branch from c6a3db5 to 72cf061 Compare June 1, 2026 08:29
tarun11Mavani and others added 11 commits June 1, 2026 09:32
A BIG_DECIMAL-valued OPEN_STRUCT key aborted segment seal: BIG_DECIMAL is
its own stored type (unlike BOOLEAN->INT, TIMESTAMP->LONG), so the splitter's
type switches hit `default: throw`. It is reachable via inferDataType for any
java.math.BigDecimal or an explicit child FieldSpec.

Add BIG_DECIMAL as a variable-length stored type alongside STRING/BYTES in
OpenStructColumnSplitter: getDefaultValue (BigDecimal.ZERO), dictionary build,
raw var-byte forward index (putBigDecimal), and dictionary-vs-raw sizing.
Dictionary dedup uses BigDecimal.equals (matching SegmentDictionaryCreator's
equals-keyed indexOfSV), not compareTo, so scale-differing equal values
(1.0 vs 1.00) stay distinct instead of silently resolving to dict id 0.

The realtime mutable path and segment min/max metadata already handle
BIG_DECIMAL and are unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tisticsCollector

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… stats collector

Treats absent docs as null docs holding the default value (standard model). For keys
with absent docs, CARDINALITY now counts the default and MIN/MAX include it. Intended.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…MetadataInfo

Replaces the hand-written emitVirtualColumnMetadata with the standard metadata
writer plus the OPEN_STRUCT-specific keys (PARENT_COLUMN, hasNullValue,
hasInvertedIndex). The shared writer additionally emits standard keys the old
code omitted, producing a superset of the previous metadata.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…istinctValuesPerKey

Replaces the _distinctValuesPerKey distinct-string-set tracking with the sealed
stats collector's cardinality and longest-element length. _totalRawBytesPerKey is
retained for the var-length raw-size estimate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dictElementSize is already 0 on the raw path (only assigned when a dictionary
creator exists), so the conditional alias was a no-op. Pass it directly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…reators

Drives each dense materialized child's forward, dictionary-id forward, and inverted index
through the standard ForwardIndexCreator / DictionaryBasedInvertedIndexCreator obtained from
StandardIndexes, using an IndexCreationContext built from the sealed stats collector (no
TableConfig). The dict-vs-raw decision now mirrors BaseSegmentCreator.createDictionaryForColumn
(standard default flags), and absent docs store the Pinot dimension null default. Deletes
writeRawForwardIndex, shouldUseDictionary, getDefaultValue, and _totalRawBytesPerKey.

Behavior changes (intended, unreleased feature): absent-doc defaults are now dimension nulls
(STRING "null", INT MIN_VALUE, ...); the local size-ratio dict downgrade is removed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…riteDenseKeyColumn

Split the ~144-line writeDenseKeyColumn into a focused orchestrator plus two
private helpers, mirroring how BaseSegmentCreator decomposes column creation:

- resolveUseDictionary(...): the three-step dict-vs-raw decision.
- writeForwardAndInvertedIndexes(...): dictionary + forward + inverted creation,
  the per-doc add loop, and the nested resource handling; returns the dictionary
  element size for metadata.

Behavior-preserving — no logic, ordering, or resource-management change. Covered
by the existing OpenStructColumnSplitterTest (16 tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…o OpenStructTypeInference

inferDataType performs value->DataType inference, an orthogonal concern to
OpenStructNaming's column-name string mapping (it uses none of the naming
constants). Relocate it to a dedicated OpenStructTypeInference class in the same
package so each class has a single responsibility; update the two callers.

Behavior-preserving — the method body is moved verbatim. The core Object
classification still delegates to PinotDataType.getSingleValueType; the
remaining PinotDataType->DataType switch is OPEN_STRUCT-specific policy that
exists nowhere else.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…taType

Data-driven test locking the value->DataType mapping policy now that it lives in
its own class: each return branch (INT via all four widening inputs, LONG, FLOAT,
DOUBLE, BIG_DECIMAL, BOOLEAN, TIMESTAMP, STRING via all four folding inputs,
BYTES) plus the unrepresentable cases (Map, bare Object) returning null.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants