
API, Core: add support for new property, write.parquet.format-version #16763

Open
The-Alchemist wants to merge 1 commit into apache:main from The-Alchemist:feature/added-config-for-parquet-format-version


@The-Alchemist


Adds write.parquet.format-version (v1 or v2) to control the Parquet writer's output. write.parquet.page-version still applies when format-version isn't set, and both properties share the same v1/v2 parsing. The new property is wired through the Spark and Flink write configs; the change is almost entirely a pass-through.
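
For reviewers, here is a rough sketch of the resolution order described above: format-version wins when set, page-version is the fallback, and both go through one v1/v2 parser. This is not the code in this PR. The class ParquetVersionSketch, its Version enum, the v1 default, and the case-insensitive parsing are all assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not the implementation in this PR.
final class ParquetVersionSketch {
  enum Version { V1, V2 }

  static final String FORMAT_VERSION = "write.parquet.format-version";
  static final String PAGE_VERSION = "write.parquet.page-version";

  // Assumption: v1 is used when neither property is set.
  static final Version DEFAULT_VERSION = Version.V1;

  private ParquetVersionSketch() {}

  // Shared parsing for both properties. Accepting any letter case is an assumption.
  static Version parse(String property, String value) {
    switch (value.trim().toLowerCase(Locale.ROOT)) {
      case "v1":
        return Version.V1;
      case "v2":
        return Version.V2;
      default:
        throw new IllegalArgumentException(
            "Invalid value for " + property + ": " + value + " (expected v1 or v2)");
    }
  }

  // format-version takes precedence; page-version applies only when it is absent.
  static Version resolve(Map<String, String> props) {
    String format = props.get(FORMAT_VERSION);
    if (format != null) {
      return parse(FORMAT_VERSION, format);
    }
    String page = props.get(PAGE_VERSION);
    if (page != null) {
      return parse(PAGE_VERSION, page);
    }
    return DEFAULT_VERSION;
  }
}
```

With this shape, a table that sets only write.parquet.page-version=v2 resolves to V2, while a table that sets both properties follows format-version.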

Open question: do we need write.delete.parquet.format-version and write.delete.parquet.page-version? They would follow the same pattern as write.delete.parquet.compression-codec, but I'm not sure how often anyone will tune delete files separately from data files. This also raises the issue of incompatibilities between format versions and page versions.
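
If the delete-specific properties were added, one possible shape is a simple fallback chain: the delete-file key overrides the data-file key, which overrides a default. The sketch below only illustrates that chain. The class DeleteVersionFallbackSketch and its method are hypothetical, and write.delete.parquet.format-version is the proposed name from the question above, not an existing property.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of this PR.
final class DeleteVersionFallbackSketch {
  static final String DATA_KEY = "write.parquet.format-version";
  // Proposed property name from the open question; assumed here, not yet defined.
  static final String DELETE_KEY = "write.delete.parquet.format-version";

  private DeleteVersionFallbackSketch() {}

  // Returns the raw version string to use for delete files:
  // delete-file override first, then the data-file setting, then the default.
  static String deleteFormatVersion(Map<String, String> props, String defaultValue) {
    String override = props.get(DELETE_KEY);
    if (override != null) {
      return override;
    }
    return props.getOrDefault(DATA_KEY, defaultValue);
  }
}
```

Under this sketch, a table that sets only write.parquet.format-version=v2 would write delete files as v2 as well, so the extra key matters only to users who want delete files to differ from data files.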

The-Alchemist changed the title from "feat(parquet): add support for new property, write.parquet.format-version" to "API, Core: add support for new property, write.parquet.format-version" on Jun 10, 2026
The-Alchemist force-pushed the feature/added-config-for-parquet-format-version branch from df02a96 to 815f6b1 on June 10, 2026 22:30
…sion

Added write.parquet.format-version (v1 or v2) to control Parquet writer
output. write.parquet.page-version still applies when format-version
isn't set. Both properties share the same v1/v2 parsing. Wire it
through Spark and Flink write configs; delete files inherit data-file
settings unless overridden.

Signed-off-by: Karl Pietrzak <karl@medplum.com>
