[CORE][VL][CH] Make Iceberg code implement component API #8192

Merged
merged 7 commits into from
Dec 10, 2024

@zhztheplayer zhztheplayer commented Dec 10, 2024

Add CHIcebergComponent / VeloxIcebergComponent to replace the previous injections through ScanTransformerFactory.

The Component API was added in #8143.

With other minor refactors.
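
For readers unfamiliar with the pattern, here is a minimal sketch of what "implementing a component" means compared with being wired in through a central factory. Every type below (`RuleInjector`, `Component`, `IcebergComponentSketch`, `install`) is a hypothetical, simplified stand-in written for illustration and is not the actual Gluten Component API from #8143; the only name taken from this PR is the rule label `OffloadIcebergScan`.

```scala
// Hypothetical, simplified stand-ins. These are NOT the actual Gluten types.
object ComponentSketch {
  import scala.collection.mutable.ArrayBuffer

  // Collects rule names only; a real injector would accept rule builders.
  final class RuleInjector {
    private val rules = ArrayBuffer.empty[String]
    def inject(ruleName: String): Unit = rules += ruleName
    def injected: List[String] = rules.toList
  }

  // Each component registers its own rules instead of being looked up by a factory.
  trait Component {
    def name(): String
    def injectRules(injector: RuleInjector): Unit
  }

  final class IcebergComponentSketch extends Component {
    override def name(): String = "iceberg-sketch"
    override def injectRules(injector: RuleInjector): Unit =
      injector.inject("OffloadIcebergScan")
  }

  // The host iterates over the installed components and lets each one inject.
  def install(components: Seq[Component]): RuleInjector = {
    val injector = new RuleInjector
    components.foreach(_.injectRules(injector))
    injector
  }
}
```

With this shape, adding or removing Iceberg support is a matter of including or omitting one component in the list passed to `install`, and the core code needs no Iceberg-specific branch.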

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Run Gluten Clickhouse CI on x86

@zhztheplayer zhztheplayer changed the title [CORE][VL][CH] Make Iceberg support a component of Gluten [CORE][VL][CH] Make Iceberg code implement component API Dec 10, 2024

Comment on lines +39 to +55
    injector.gluten.legacy.injectTransform {
      c =>
        val offload = Seq(OffloadIcebergScan())
        HeuristicTransform.Simple(
          Validators.newValidator(c.glutenConf, offload),
          offload
        )
    }

    // Inject RAS rule.
    injector.gluten.ras.injectRasRule {
      c =>
        RasOffload.Rule(
          RasOffload.from[BatchScanExec](OffloadIcebergScan()),
          Validators.newValidator(c.glutenConf),
          Nil)
    }
Member Author

The code can be simplified in future PRs.

@zhztheplayer zhztheplayer marked this pull request as ready for review December 10, 2024 05:41

@zhztheplayer
Member Author

cc @liujiayi771 @zzcclp

@@ -94,4 +98,8 @@ object IcebergScanTransformer {
      commonPartitionValues = SparkShimLoader.getSparkShims.getCommonPartitionValues(batchScan)
    )
  }

  def supportsBatchScan(scan: Scan): Boolean = {
    scan.getClass.getName == "org.apache.iceberg.spark.source.SparkBatchQueryScan"
Contributor

Can we just reference this class to get its name for comparison?

Member Author

@zhztheplayer zhztheplayer Dec 10, 2024

SparkBatchQueryScan is package private.

cc @liujiayi771 if there are some better practices for this.
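
To make the trade-off in this thread concrete, here is a small sketch of two ways to test for a class that cannot be named at compile time. It is an illustration under assumptions, not a proposal for this PR: `ClassNameCheck`, `hasClassName`, and `isInstanceOfNamed` are hypothetical helper names, and the JDK's private nested class `java.util.Collections$EmptyList` stands in for the package-private `SparkBatchQueryScan`.

```scala
object ClassNameCheck {
  // Exact match on the runtime class name, the same approach as the diff above.
  // Subclasses of the named class do not match.
  def hasClassName(obj: AnyRef, className: String): Boolean =
    obj.getClass.getName == className

  // Alternative: resolve the class reflectively and use isInstance.
  // This also accepts subclasses, and returns false when the class is
  // not on the classpath (for example, when the optional jar is absent).
  def isInstanceOfNamed(obj: AnyRef, className: String): Boolean =
    try Class.forName(className).isInstance(obj)
    catch { case _: ClassNotFoundException => false }
}

object ClassNameCheckDemo {
  // A private nested JDK class, so classOf[...] cannot reference it from here.
  val hiddenName = "java.util.Collections$EmptyList"
  val empty: java.util.List[String] = java.util.Collections.emptyList[String]()

  val byName: Boolean = ClassNameCheck.hasClassName(empty, hiddenName)
  val byReflection: Boolean = ClassNameCheck.isInstanceOfNamed(empty, hiddenName)
}
```

The string comparison needs no class loading and cannot fail, which suits a check that runs on every scan. The reflective variant depends on which class loader resolves the name, so in a Spark deployment it may behave differently from this standalone sketch.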

class CHIcebergComponent extends Component {
  override def name(): String = "clickhouse-iceberg"
  override def buildInfo(): Component.BuildInfo =
    Component.BuildInfo("ClickHouseIceberg", "N/A", "N/A", "N/A")
Contributor

@PHILO-HE PHILO-HE Dec 10, 2024

It seems buildInfo is not necessary for any component other than Backend. Maybe it could be kept only on Backend in a future refactor.

Member Author

Right, this part needs some cleanup. BuildInfo is indeed not that general. I'd like to see it fully decoupled from the Component API somehow.
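
One possible shape for that cleanup, sketched under assumptions: only a `Backend` subtype carries build information, so a plain component no longer has to return `"N/A"` placeholders. All types here are hypothetical stand-ins defined inside `BuildInfoSketch`, the `BuildInfo` field names are guesses (the snippet above only shows four string arguments), and the values in `ExampleBackend` are made-up placeholders.

```scala
// Hypothetical sketch; not the actual Gluten Component or Backend definitions.
object BuildInfoSketch {
  // Field names are assumptions made for illustration.
  final case class BuildInfo(
      name: String,
      branch: String,
      revision: String,
      revisionTime: String)

  // A plain component has a name and nothing about builds.
  trait Component {
    def name(): String
  }

  // Only backends are required to report build information.
  trait Backend extends Component {
    def buildInfo(): BuildInfo
  }

  final class IcebergComponent extends Component {
    override def name(): String = "clickhouse-iceberg"
  }

  final class ExampleBackend extends Backend {
    override def name(): String = "example-backend"
    // Placeholder values, not real build metadata.
    override def buildInfo(): BuildInfo =
      BuildInfo("ExampleBackend", "main", "0000000", "unknown")
  }

  // Build info is gathered from backends only; other components are skipped.
  def buildInfos(components: Seq[Component]): Seq[BuildInfo] =
    components.collect { case b: Backend => b.buildInfo() }
}
```

Whether build information belongs on a subtype, in a separate registry, or somewhere else entirely is left open in this thread; the sketch only shows that the Iceberg component can drop the method without losing anything.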

@zhztheplayer zhztheplayer merged commit f37f459 into apache:main Dec 10, 2024
46 checks passed
yikf pushed a commit to yikf/incubator-gluten that referenced this pull request Dec 13, 2024