Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMD] Fix unhandled profile event in RoctracerProfiler #5252

Merged
merged 1 commit into from
Nov 26, 2024

Conversation

AlexAUT
Copy link
Contributor

@AlexAUT AlexAUT commented Nov 25, 2024

Fixes proton unit tests when upgrading to rocm 6.2 by adding missing event handlers.
Magic number is replaced with the corresponding enum value which was added by upgrading the HIP headers #5077.

New contributor declaration

  • I am not making a trivial change, such as fixing a typo in a comment.

  • I have written a PR description following these
    rules.

  • I have run pre-commit run --from-ref origin/main --to-ref HEAD.

  • Select one of the following.

    • I have added tests.
      • /test for lit tests
      • /unittest for C++ tests
      • /python/test for end-to-end tests
    • This PR does not need a test because Tests already cover the issue.
  • Select one of the following.

    • I have not added any lit tests.
    • The lit tests I have added follow these best practices,
      including the "tests should be minimal" section. (Usually running Python code
      and using the instructions it generates is not minimal.)

@AlexAUT AlexAUT mentioned this pull request Nov 25, 2024
7 tasks
@CRobeck
Copy link
Contributor

CRobeck commented Nov 25, 2024

The reason for the unit tests failing was just an undefined event? I feel like the tests shouldn't fail in that case, just not capture the unknown event?

@@ -135,7 +136,7 @@ void processActivity(RoctracerProfiler::CorrIdToExternIdMap &corrIdToExternId,
const roctracer_record_t *record, bool isAPI,
bool isGraph) {
switch (record->kind) {
case 0x11F1: // Task - kernel enqueued by graph launch
case kHipVdiCommandTask:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing the magic number is a NFC right? It's just now we have a named variable in the HIP headers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NFC? The hip headers are included with Triton so there should not be any issues as this is a drop in replacement as the enum maps to the same number.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NFC - no functional change.

@antiagainst
Copy link
Collaborator

The reason for the unit tests failing was just an undefined event? I feel like the tests shouldn't fail in that case, just not capture the unknown event?

It's for an upcoming docker change that bumps ROCm version; see the discussion here #5230 (comment)

@CRobeck
Copy link
Contributor

CRobeck commented Nov 25, 2024

Right, I get that but I'm trying to figure out "what" specifically changes when we bump the ROCm version. Did we change the HIP event which tracks graph launches?

@antiagainst antiagainst marked this pull request as ready for review November 26, 2024 18:15
@antiagainst antiagainst merged commit 68a08dd into triton-lang:main Nov 26, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants