-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spark 3.5: Fix metrics reporting in distributed planning #8602
Spark 3.5: Fix metrics reporting in distributed planning #8602
Conversation
@@ -53,6 +53,10 @@ public BaseTable(TableOperations ops, String name, MetricsReporter reporter) { | |||
this.reporter = reporter; | |||
} | |||
|
|||
MetricsReporter reporter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simplest approach is to just expose the reporter with package-private access.
We could have an interface called HasMetricsReporter
, similar to HasTableOperations
. It will be a bit more generic but it is not required for this PR. Let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be fine. In fact, I'm making it public in https://github.com/apache/iceberg/pull/8032/files#diff-725c9531623b4e47248f10a8caac9b18ead6906101efdf7e8c14c6363b3b6d59R267
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good to know!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @aokolnychyi
Thanks, @nastra! |
This PR fixes metrics reporting in distributed planning:
SparkDistributedDataScan
picks up the base reporter configured for the table.SparkDistributedDataScan
correctly computes the number of skipped data and delete files.