build: Enable Spark query runner as reference in aggregation fuzzer test #9559

rui-mo · 2024-04-22T01:01:35Z

Enables Spark query runner as reference DB in Spark aggregation fuzzer test.
Removes duplicate Spark aggregation fuzzer test from experimental run.

Fixes #9270.
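For readers new to the setup, the idea of a "reference DB" in a fuzzer can be sketched as follows: the same aggregation query runs on the engine under test and on a reference engine, and the two result sets are compared as multisets, so row order does not matter. This is only an illustrative sketch, not the code in this PR; `normalize` and `results_match` are hypothetical names, and rounding floats to a fixed number of digits is an assumed, simplistic stand-in for a real tolerance check.

```python
import math
from collections import Counter


def normalize(row, digits=6):
    """Make a row hashable and tolerant of tiny float differences.

    Hypothetical helper: floats are rounded to `digits` places and NaN is
    mapped to a sentinel so that NaN compares equal to NaN.
    """
    out = []
    for value in row:
        if isinstance(value, float):
            out.append("NaN" if math.isnan(value) else round(value, digits))
        else:
            out.append(value)
    return tuple(out)


def results_match(actual_rows, expected_rows):
    """Compare two result sets as multisets: order is ignored, duplicates count."""
    actual = Counter(normalize(r) for r in actual_rows)
    expected = Counter(normalize(r) for r in expected_rows)
    return actual == expected


# Usage: rows from the engine under test vs. rows from the reference engine.
engine_rows = [("a", 1.0), ("b", 2.5)]
reference_rows = [("b", 2.5), ("a", 1.0)]
assert results_match(engine_rows, reference_rows)
```

Comparing as multisets rather than sets means a query that drops or repeats a row is still reported as a mismatch.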

netlify · 2024-04-22T01:01:51Z

✅ Deploy Preview for meta-velox canceled.

| Name | Link |
|---|---|
| 🔨 Latest commit | `7283cac` |
| 🔍 Latest deploy log | https://app.netlify.com/sites/meta-velox/deploys/674d513ee547960007274cac |

rui-mo · 2024-05-14T00:55:06Z

Hi @mbasmanova, this is a draft PR to enable the Spark query runner in the aggregate fuzzer test. I'd like to work on the items below in separate PRs. Could you spare some time to check whether they make sense? Thank you.

Add support for docker image with Spark connect server #9759
Refactor the toSql by introducing helper functions to allow reusability for Spark.
Introduce SparkQueryRunner and enable it in agg fuzzer test.

mbasmanova · 2024-05-14T14:04:57Z

@rui-mo Rui, thank you for working on this. At a high level, the next steps make sense to me, but there are not enough details for me to understand them fully.

> Fix the intermediate type of Sum aggregate discovered when running fuzzer test.

Would you create a GitHub issue to describe this problem?

> Introduce a configuration to specify if the vectors generated by VectorFuzzer are for Arrow parquet. We need it because the exportFlattenedVector method in Bridge.cpp does not support nested encodings or non-scalar types, which would break Parquet writes.

Does this mean we won't be able to create tables with map/array/struct columns? If so, this is a pretty severe limitation.

rui-mo · 2024-05-15T06:45:11Z

@mbasmanova Thanks for your reply.
For the sum aggregate issue, I notice #9818 is addressing the same issue as discovered by this fuzzer test.
For the second one, I created #9821 and provided some details. Thanks.

majetideepak

@rui-mo CPP code changes look good.
@assignUser can you please take another look at the workflow file changes? Thanks.

majetideepak · 2024-10-29T14:20:03Z

.github/workflows/experimental.yml

-    runs-on: ubuntu-latest
-    needs: compile
+  spark-java-aggregation-fuzzer-run:
+    runs-on: 16-core-ubuntu


@assignUser FYI

assignUser · 2024-10-30T00:32:45Z

.github/workflows/scheduled.yml

-    timeout-minutes: 60
+    timeout-minutes: 120


Why is this change necessary? The usual duration is still 15/30mins. Did you actually see a timeout?

Got it. This change is not necessary, so I just reverted it.

rui-mo · 2024-10-30T09:48:53Z

@assignUser I've addressed the comments above. Would you like to take another look? Thanks.

assignUser

Thanks!

facebook-github-bot · 2024-11-04T13:41:38Z

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

mbasmanova · 2024-11-04T14:44:08Z

@rui-mo Would you rebase to allow merging?

facebook-github-bot · 2024-11-05T10:06:14Z

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

rui-mo · 2024-11-12T00:38:31Z

Hi @kevinwilfong, could you help import and merge this PR? Thanks!

facebook-github-bot added the CLA Signed label Apr 22, 2024

rui-mo changed the title ~~Support Spark query runner~~ WIP: Support Spark query runner Apr 22, 2024

majetideepak reviewed Apr 24, 2024

rui-mo force-pushed the wip_9270 branch 2 times, most recently from f5dcf32 to fccac50 Compare April 25, 2024 00:50

rui-mo force-pushed the wip_9270 branch from fccac50 to da7d81e Compare May 9, 2024 01:59

PHILO-HE mentioned this pull request May 9, 2024

Register some re-usable Presto functions for Spark #9425

Closed

rui-mo force-pushed the wip_9270 branch 6 times, most recently from 4dd93a7 to 5d96887 Compare May 14, 2024 00:19

rui-mo force-pushed the wip_9270 branch from 5d96887 to b4a82d6 Compare May 14, 2024 01:14

rui-mo force-pushed the wip_9270 branch 5 times, most recently from 5400aed to 61c3f07 Compare May 17, 2024 06:10

rui-mo force-pushed the wip_9270 branch from 61c3f07 to a356700 Compare June 20, 2024 03:12

rui-mo changed the title ~~WIP: Support Spark query runner~~ WIP: Enable Spark query runner as reference in aggregation fuzzer test Jun 20, 2024

rui-mo force-pushed the wip_9270 branch 5 times, most recently from c5b6b9e to 52a74bd Compare June 25, 2024 01:46

rui-mo force-pushed the wip_9270 branch from f2a92fa to 9705919 Compare September 26, 2024 03:43

rui-mo force-pushed the wip_9270 branch from 9705919 to abca0d8 Compare October 23, 2024 09:00

rui-mo requested a review from assignUser as a code owner October 23, 2024 09:00

rui-mo force-pushed the wip_9270 branch 2 times, most recently from ef10ffc to d0b6ed9 Compare October 29, 2024 02:04

majetideepak reviewed Oct 29, 2024

assignUser reviewed Oct 30, 2024

rui-mo force-pushed the wip_9270 branch 2 times, most recently from 717c2a3 to ba10f70 Compare October 30, 2024 07:24

rui-mo commented Oct 30, 2024

assignUser approved these changes Nov 2, 2024

assignUser added the ready-to-merge label Nov 2, 2024

mbasmanova requested a review from kgpai November 4, 2024 13:41

rui-mo force-pushed the wip_9270 branch from ba10f70 to 1e227c4 Compare November 5, 2024 01:13

rui-mo force-pushed the wip_9270 branch 6 times, most recently from 7612685 to aab3097 Compare November 11, 2024 06:53

rui-mo force-pushed the wip_9270 branch from aab3097 to 32b100c Compare November 13, 2024 04:10

rui-mo force-pushed the wip_9270 branch from 32b100c to 140adc2 Compare November 21, 2024 06:31

rui-mo changed the title ~~Enable Spark query runner as reference in aggregation fuzzer test~~ build: Enable Spark query runner as reference in aggregation fuzzer test Nov 21, 2024

Enable Spark query runner in aggregate fuzzer test

7283cac

rui-mo force-pushed the wip_9270 branch from 140adc2 to 7283cac Compare December 2, 2024 06:18
