Describe the bug
SDPA currently does not support head dimensions other than 64, 96, and 128.
To Reproduce
Fused attention falls back to the regular (unfused) implementation when the head dimension is not in (64, 96, 128):
https://github.com/ml-explore/mlx/blob/main/mlx/fast.cpp#L644-L645
Request
Support SDPA for uncommon head dimensions (still a multiple of 32).
Additional context
Awni's suggestion: "generalize it so that any even head dim or maybe multiple of 32 is supported"