Add GQA and KV Cache #1696

Open. Wants to merge 8 commits into base: develop


dhernandez0 (Contributor) commented Nov 18, 2024

This PR adds:

  1. GQA (grouped-query attention) support in rocmlir-gen via --num_heads_q and --num_heads_kv, together with GQA tests.
  2. A KV cache implementation: currentSeqLen handles a dynamic sequence length for K and V.

Note that MIGraphX integration will be done in a follow-up ticket (we need to sync with the MIGraphX team to decide how to do it). So, for now, TosaToRock sets currentSeqLen = nullptr.
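
For reviewers less familiar with the two features, here is a minimal reference sketch in plain Python. It is not the code in this PR. It assumes that query head `h` reads KV head `h // (num_heads_q // num_heads_kv)`, that scores are scaled by `1/sqrt(head_dim)`, and that `current_seq_len` marks how many leading K/V positions of a preallocated cache are valid. The function name `gqa_attention` and its argument layout are hypothetical; passing `None` is meant to loosely mirror the `currentSeqLen = nullptr` case, where the whole K/V extent is used.

```python
import math


def gqa_attention(q, k, v, num_heads_q, num_heads_kv, current_seq_len=None):
    """Reference grouped-query attention over nested Python lists.

    q:    [num_heads_q][seq_q][head_dim]
    k, v: [num_heads_kv][max_seq_kv][head_dim]
    current_seq_len: number of valid leading K/V positions (assumed
    KV-cache semantics); None means every position is valid.
    """
    if num_heads_q % num_heads_kv != 0:
        raise ValueError("num_heads_q must be a multiple of num_heads_kv")
    group_size = num_heads_q // num_heads_kv
    max_seq_kv = len(k[0])
    valid = max_seq_kv if current_seq_len is None else current_seq_len
    if not 1 <= valid <= max_seq_kv:
        raise ValueError("current_seq_len must be in [1, max_seq_kv]")

    out = []
    for head_q in range(num_heads_q):
        # Assumed mapping: consecutive query heads share one KV head.
        head_kv = head_q // group_size
        head_out = []
        for q_row in q[head_q]:
            scale = 1.0 / math.sqrt(len(q_row))
            # Only the first `valid` cache positions take part.
            scores = [
                scale * sum(a * b for a, b in zip(q_row, k[head_kv][t]))
                for t in range(valid)
            ]
            # Numerically stable softmax over the valid positions.
            peak = max(scores)
            exps = [math.exp(s - peak) for s in scores]
            denom = sum(exps)
            dim_v = len(v[head_kv][0])
            head_out.append([
                sum(exps[t] / denom * v[head_kv][t][j] for t in range(valid))
                for j in range(dim_v)
            ])
        out.append(head_out)
    return out
```

With 4 query heads and 2 KV heads, heads 0 and 1 read KV head 0 and heads 2 and 3 read KV head 1. Calling the function with `current_seq_len=2` on a cache of length 4 should match a call on K and V truncated to their first two positions, whatever the unused cache slots contain.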

dhernandez0 self-assigned this Nov 18, 2024
dhernandez0 changed the title from "Add GQA support for rocmlir-gen and add GQA tests" to "Add GQA and KV Cache" Nov 20, 2024
dhernandez0 force-pushed the 1655-groupqueryattention-with-kv-cache branch from 0f3912b to b84f77d November 28, 2024 12:25
codecov bot commented Nov 28, 2024

Codecov Report

Attention: Patch coverage is 85.42199% with 57 lines in your changes missing coverage. Please review.

Project coverage is 78.25%. Comparing base (64ccaae) to head (f248ae5).
Report is 1 commit behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| mlir/tools/rocmlir-gen/rocmlir-gen.cpp | 89.63% | 18 Missing and 5 partials ⚠️ |
| mlir/lib/Dialect/Rock/IR/RockDialect.cpp | 27.58% | 13 Missing and 8 partials ⚠️ |
| ...ialect/Rock/Transforms/GridwiseGemmToBlockwise.cpp | 92.37% | 7 Missing and 2 partials ⚠️ |
| .../Dialect/Rock/Transforms/AffixTuningParameters.cpp | 25.00% | 3 Missing ⚠️ |
| mlir/lib/Dialect/Rock/utility/builderUtils.cpp | 92.85% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1696      +/-   ##
===========================================
+ Coverage    78.22%   78.25%   +0.02%     
===========================================
  Files          100      100              
  Lines        28009    28346     +337     
  Branches      4097     4130      +33     
===========================================
+ Hits         21910    22182     +272     
- Misses        4434     4484      +50     
- Partials      1665     1680      +15     

| Flag | Coverage Δ |
|---|---|
| mfma | 78.25% <85.42%> (+0.02%) ⬆️ |


2 participants