Docs: Nit: Fix parquet default compression codec #9096

Merged: 2 commits, Nov 16, 2023

@tomtongue tomtongue commented Nov 16, 2023

The docs still list gzip as the default Parquet compression codec.

I tested Iceberg 1.4.2 with Spark, and the default codec is zstd:

# DESCRIBE EXTENDED TABLE
...
|Provider                    |iceberg                                                                                                                                                     |       |
|Owner                       |spark                                                                                                                                                       |       |
|Table Properties            |[current-snapshot-id=5229490619909685802,format=iceberg/parquet,format-version=2,write.metadata.compression-codec=gzip,write.parquet.compression-codec=zstd]|       |
+----------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+
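
For anyone who wants to script the same check, here is a minimal sketch that pulls the codec out of the `Table Properties` value shown above. The helper name `parse_table_properties` is hypothetical (it is not part of Iceberg or Spark), and the sketch assumes the value is a bracketed, comma-separated list of `key=value` pairs with no commas inside the values.

```python
def parse_table_properties(raw: str) -> dict[str, str]:
    """Parse a '[k1=v1,k2=v2]' string, as printed by DESCRIBE EXTENDED, into a dict.

    Assumption: no value contains a comma. Requires Python 3.9+.
    """
    body = raw.strip().removeprefix("[").removesuffix("]")
    props = {}
    for pair in body.split(","):
        if "=" not in pair:
            continue  # skip anything that is not a key=value pair
        key, value = pair.split("=", 1)
        props[key.strip()] = value.strip()
    return props


# The Table Properties value copied from the DESCRIBE EXTENDED output above.
raw = (
    "[current-snapshot-id=5229490619909685802,format=iceberg/parquet,"
    "format-version=2,write.metadata.compression-codec=gzip,"
    "write.parquet.compression-codec=zstd]"
)
props = parse_table_properties(raw)
print(props["write.parquet.compression-codec"])  # prints: zstd
```

Note that `write.metadata.compression-codec=gzip` in the same output refers to metadata files, not to Parquet data files, so the two properties should be read separately.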

A Parquet file that was written by the Spark app:

$ parquet footer 00000-0-81cf5185-9ecd-424f-9f65-f93a84f3e390-00001.parquet | grep codec | head -1
        "codec" : "ZSTD",

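The `head -1` above only inspects the first `codec` entry in the footer. As a rough sketch, the following collects every distinct codec from the footer text instead, which would reveal a file that mixes codecs. The function name `codecs_in_footer` is hypothetical, and the sample text is an assumption modeled on the single line of output shown above, not a full footer dump.

```python
import re

# Matches lines such as:  "codec" : "ZSTD",
CODEC_PATTERN = re.compile(r'"codec"\s*:\s*"([A-Za-z0-9_]+)"')


def codecs_in_footer(footer_text: str) -> set[str]:
    """Return the set of distinct codec names found in the footer text."""
    return set(CODEC_PATTERN.findall(footer_text))


# Assumed sample: two entries in the same shape as the output above.
sample = '        "codec" : "ZSTD",\n        "codec" : "ZSTD",\n'
print(codecs_in_footer(sample))  # prints: {'ZSTD'}
```

In practice the footer text would come from the `parquet footer` command output, for example by piping it to the script on standard input.
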
The release notes (https://iceberg.apache.org/releases/) confirm this:

Use zstd compression for Parquet by default in new tables (#8593)

I also confirmed that the documented default compression codecs for the other file formats (Avro and ORC) are correct.

@github-actions github-actions bot added the docs label Nov 16, 2023
@@ -47,51 +47,51 @@ Iceberg tables support table properties to configure table behavior, like the de

### Write properties

| Property | Default | Description |

@tomtongue thanks for updating this. Can you please update the PR so that there's a diff only for that one single line that needs the change?

@nastra thanks for the quick review, and sorry for the unnecessary whitespace changes (I missed that my editor had added them automatically). I've removed them.

@tomtongue tomtongue requested a review from nastra November 16, 2023 15:33
@nastra nastra left a comment

LGTM, thanks @tomtongue

@nastra nastra merged commit 72da856 into apache:main Nov 16, 2023
2 checks passed
@tomtongue tomtongue deleted the doc-parquet-codec-fix branch November 16, 2023 15:58
devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024