Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GLUTEN-7087][CH] Support WindowGroupLimitExec #7176

Merged
merged 3 commits into from
Sep 12, 2024
Merged

Conversation

lgbo-ustc
Copy link
Contributor

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

Fixes: #7087

WindowGroupLimitExec is introduced in spark-3.5. This PR makes it supported

How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

unit tests

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

@github-actions github-actions bot added CORE works for Gluten Core CLICKHOUSE labels Sep 10, 2024
Copy link

#7087

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@@ -166,6 +167,8 @@ JNIEXPORT jint JNI_OnLoad(JavaVM * vm, void * /*reserved*/)
local_engine::BroadCastJoinBuilder::init(env);
local_engine::CacheManager::initJNI(env);

DB::registerFormats();
Copy link
Contributor

@baibaichen baibaichen Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this PR need to call DB:: registerFormats()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dump block needs formats.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding a simple method to dump block contents is very useful for debug

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean calling DB:: registerFormats() is for later dumping?

@@ -315,6 +316,16 @@ DB::Block BlockUtil::concatenateBlocksMemoryEfficiently(std::vector<DB::Block> &
return out;
}

String BlockUtil::dumpBlock(const DB::Block & block)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are about moving to BlockTypeUtils.h/.cpp ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BlockTypeUtils.h/.cpp is just about types?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it's block and type utils :)

Copy link
Contributor

@baibaichen baibaichen Sep 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If dumping is just for debug aid, I think we should move it to DebugUtils.h/cpp, and put it to debug namespace. In product code, we should put it into #ifdef NDEBUG .. #endif. e.g.

#ifdef NDEBUG
 debug::dumpxxx();
#endif

image

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@lgbo-ustc lgbo-ustc merged commit afccfed into apache:main Sep 12, 2024
46 checks passed
sharkdtu pushed a commit to sharkdtu/gluten that referenced this pull request Nov 11, 2024
* support WindowGroupLimit

* 0903

* implement window group limit
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLICKHOUSE CORE works for Gluten Core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CH] Support WindowGroupLimit for row_number, rank and dense_rank
2 participants