Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

have activelaneid_u32 call ockl_activelane_u32 #1018

Merged
merged 1 commit into from
Feb 14, 2019

Conversation

david-salinas
Copy link
Collaborator

for #1011

@b-sumner
Copy link

b-sumner commented Feb 1, 2019

Is there a test that checks that all/most of the __foo functions compile? If so it needs updates too.

@david-salinas
Copy link
Collaborator Author

Is there a test that checks that all/most of the __foo functions compile? If so it needs updates too.

sorry, I should have already had a test ready. Let me cook one up, and update this PR.

int k1 = int_dist(rd);
int k2 = int_dist(rd);
if (k1 != k2) {
test[i * WAVEFRONT_SIZE + k1] ^= test[i * WAVEFRONT_SIZE + k2] ^= test[i * WAVEFRONT_SIZE + k1] ^= test[i * WAVEFRONT_SIZE + k2];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just code a swap?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this test (I based it on the existing "activelanecount.cpp" test) most of this set up of "test" can be scrapped, I think. We can just create a PFE, with a suitable extent, and an output grid/vector and call activelaneid to verify that we get correct/valid laneid values.


for (int i = 0; i < WAVEFRONT_SIZE; ++i) {
for (int j = 0; j < WAVEFRONT_SIZE; ++j) {
#if TEST_DEBUG
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move this outside the entire loop nest

array<uint32_t, 1> output_GPU(GRID_SIZE);
extent<1> ex(GRID_SIZE);
parallel_for_each(ex, [&](index<1>& idx) [[hc]] {
output_GPU(idx) = __activelaneid_u32();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every lane is sure to be active. Control flow is needed to deactivate some lanes.

tests/Unit/AMDGPU/activelaneid.cpp Show resolved Hide resolved
include/hc.hpp Outdated Show resolved Hide resolved
include/hc.hpp Outdated Show resolved Hide resolved
@scchan
Copy link
Collaborator

scchan commented Feb 11, 2019

@david-salinas ping...

@b-sumner
Copy link

I'm not sure I'm seeing the latest version of the test code. I agree it can be small, but the kernel should have some control flow so that we can see that activelaneid is reporting the correct ID on lanes that are truly active.

@scchan
Copy link
Collaborator

scchan commented Feb 11, 2019

I don't see any update to the code either

@david-salinas
Copy link
Collaborator Author

Sorry, I was about to push an update test, when I realized I need to add some control flow to deactivate some lanes. I should have an update shortly.

@scchan scchan merged commit 4691670 into ROCm:clang_tot_upgrade Feb 14, 2019
gargrahul pushed a commit to gargrahul/hcc that referenced this pull request Feb 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants