
pyiceberg.io.pyarrow.write_file does not take into account compression settings #345

@jonashaag

Apache Iceberg version

main (development)

Please describe the bug 🐞

I expected table.append() to take my write.parquet.compression-codec and related settings into account, but it looks like no compression settings are passed to ParquetWriter:

with pq.ParquetWriter(fos, schema=file_schema, version="1.0", metadata_collector=collected_metrics) as writer:

For my data this means a 5x size increase compared to writing with zstd at level 3.
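
To illustrate what I had in mind, here is a rough sketch, not a proposed patch. It assumes the table properties are available as a plain dict of strings at the point where the writer is created. The helper names `parquet_writer_kwargs` and `write_with_properties` are hypothetical, and the `write.parquet.compression-level` key is my assumption for the "level" part of the settings. The `compression` and `compression_level` keyword arguments are the ones `pq.ParquetWriter` already accepts.

```python
from typing import Any, Dict

# Only the codec key is quoted in this issue; the level key is an assumption.
CODEC_KEY = "write.parquet.compression-codec"
LEVEL_KEY = "write.parquet.compression-level"


def parquet_writer_kwargs(properties: Dict[str, str]) -> Dict[str, Any]:
    """Hypothetical helper: map table properties to ParquetWriter keyword arguments."""
    kwargs: Dict[str, Any] = {}
    codec = properties.get(CODEC_KEY)
    if codec is not None:
        kwargs["compression"] = codec
    level = properties.get(LEVEL_KEY)
    if level is not None:
        # Table properties are stored as strings, so convert the level to int.
        kwargs["compression_level"] = int(level)
    return kwargs


def write_with_properties(path: str, table: Any, properties: Dict[str, str]) -> None:
    """Hypothetical wrapper: write a pyarrow Table using the mapped settings."""
    import pyarrow.parquet as pq  # imported here so the helper above has no dependencies

    with pq.ParquetWriter(
        path,
        schema=table.schema,
        version="1.0",
        **parquet_writer_kwargs(properties),
    ) as writer:
        writer.write_table(table)
```

With properties such as {"write.parquet.compression-codec": "zstd", "write.parquet.compression-level": "3"} the writer would receive compression="zstd" and compression_level=3. With no properties set, nothing extra is passed and pyarrow's own defaults apply.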
