Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Add ManifestWrite benchmark #8637

Merged
merged 9 commits into from
Aug 20, 2024
Merged

Core: Add ManifestWrite benchmark #8637

merged 9 commits into from
Aug 20, 2024

Conversation

Fokko
Copy link
Contributor

@Fokko Fokko commented Sep 25, 2023

No description provided.

@github-actions github-actions bot added the core label Sep 25, 2023
manifestListFile = String.format("%s/%s.avro", baseDir, UUID.randomUUID());

try (ManifestListWriter listWriter =
ManifestLists.write(1, org.apache.iceberg.Files.localOutput(manifestListFile), 0, 1L, 0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we also add a parameter for format version? I guess if would be great if we can test both format version but not really needed in same PR

Copy link
Contributor

@aokolnychyi aokolnychyi Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it would be nice to cover both formats. We can use @Param annotation.
Like in IcebergSourceParquetPosDeleteBenchmark, for instance.

@Benchmark
@Threads(1)
public void writeManifestFile() throws IOException {
baseDir = Files.createTempDir().getAbsolutePath();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: We frequently use this.varName prefix when setting (not getting) instance variable to highlight that it is an instance variable, not a local variable. It is optional, up to you @Fokko. If you decide to apply that, must be done to all places that assign instance variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a pythonista; explicit is better than implicit. I've added this.

try (ManifestWriter<DataFile> finalWriter = writer) {
for (int j = 0; j < NUM_ROWS; j++) {
DataFile dataFile =
DataFiles.builder(PartitionSpec.unpartitioned())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to cover partitioned and delete manifests (could be a separate PR).

Copy link
Contributor Author

@Fokko Fokko Dec 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it in a separate PR 👍 It feels to me that the benchmark are very much on a need to benchmark basis

Copy link
Contributor

@aokolnychyi aokolnychyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems really close. I agree with comments from @dramaticlly. Could you take a look, @Fokko? Let's get this in.

@aokolnychyi
Copy link
Contributor

There was a style violation:

Error: eckstyle] [ERROR] /home/runner/work/iceberg/iceberg/core/src/jmh/java/org/apache/iceberg/ManifestWriteBenchmark.java:89:9: Variable 'formatVersion' must be private and have accessor methods. [VisibilityModifier]

@Fokko Fokko merged commit 2f2c367 into apache:main Aug 20, 2024
46 checks passed
@Fokko Fokko deleted the fd-add-benchmark branch August 20, 2024 13:08
zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024
* Core: Add ManifestWrite benchmark

* Thanks for the review!

* Cleanup

* Set timeout

* Remove public

Co-authored-by: Hongyue/Steve Zhang <[email protected]>

* Make ErrorProne happy

---------

Co-authored-by: Hongyue/Steve Zhang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants