-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
have activelaneid_u32 call ockl_activelane_u32 #1018
Conversation
Is there a test that checks that all/most of the __foo functions compile? If so it needs updates too. |
sorry, I should have already had a test ready. Let me cook one up, and update this PR. |
f74a870
to
716cb85
Compare
tests/Unit/AMDGPU/activelaneid.cpp
Outdated
int k1 = int_dist(rd); | ||
int k2 = int_dist(rd); | ||
if (k1 != k2) { | ||
test[i * WAVEFRONT_SIZE + k1] ^= test[i * WAVEFRONT_SIZE + k2] ^= test[i * WAVEFRONT_SIZE + k1] ^= test[i * WAVEFRONT_SIZE + k2]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just code a swap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this test (I based it on the existing "activelanecount.cpp" test) most of this set up of "test" can be scrapped, I think. We can just create a PFE, with a suitable extent, and an output grid/vector and call activelaneid to verify that we get correct/valid laneid values.
tests/Unit/AMDGPU/activelaneid.cpp
Outdated
|
||
for (int i = 0; i < WAVEFRONT_SIZE; ++i) { | ||
for (int j = 0; j < WAVEFRONT_SIZE; ++j) { | ||
#if TEST_DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this outside the entire loop nest
tests/Unit/AMDGPU/activelaneid.cpp
Outdated
array<uint32_t, 1> output_GPU(GRID_SIZE); | ||
extent<1> ex(GRID_SIZE); | ||
parallel_for_each(ex, [&](index<1>& idx) [[hc]] { | ||
output_GPU(idx) = __activelaneid_u32(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every lane is sure to be active. Control flow is needed to deactivate some lanes.
@david-salinas ping... |
I'm not sure I'm seeing the latest version of the test code. I agree it can be small, but the kernel should have some control flow so that we can see that activelaneid is reporting the correct ID on lanes that are truly active. |
I don't see any update to the code either |
Sorry, I was about to push an update test, when I realized I need to add some control flow to deactivate some lanes. I should have an update shortly. |
716cb85
to
c8d80ea
Compare
for #1011