[Enhancement Request] SDFA does not support head dimension of size 192 (capped at 128) #1604

Open
s-tlgh opened this issue Nov 19, 2024 · 0 comments
Labels
enhancement New feature or request

s-tlgh commented Nov 19, 2024

Describe the bug
SDFA (fused attention) currently supports only head dimensions of 64, 96, and 128. Any other size, such as 192, is not handled by the fused kernel.

To Reproduce
Fused attention falls back to the regular (unfused) implementation whenever the head dimension is not one of 64, 96, or 128:
https://github.com/ml-explore/mlx/blob/main/mlx/fast.cpp#L644-L645
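
The behaviour described above can be modelled with a short Python sketch. It is based only on the description in this issue, not on the linked C++ source; the names `FUSED_HEAD_DIMS` and `uses_fused_kernel` are hypothetical and exist only for illustration.

```python
# Hypothetical model of the dispatch described in this issue; not the MLX source.
FUSED_HEAD_DIMS = (64, 96, 128)  # head dimensions reported as supported


def uses_fused_kernel(head_dim: int) -> bool:
    """Return True if the fused path would be taken for this head dimension."""
    return head_dim in FUSED_HEAD_DIMS


# Show which path each head dimension would take under this model.
for d in (64, 96, 128, 192):
    path = "fused" if uses_fused_kernel(d) else "fallback"
    print(f"head_dim={d}: {path}")
```

Under this model a head dimension of 192 takes the fallback path, which is the case this issue asks to change.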

Request
Extend SDFA to support less common head dimensions that are still multiples of 32, such as 192.

Desktop (please complete the following information):

  • OS Version: macOS 15.2
  • MLX Version: 0.20.0

Additional context
Awni's suggestion: «generalize it so that any even head dim or maybe multiple of 32 is supported»
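
The two options in that suggestion can be compared with a small Python sketch. The predicate names are hypothetical, and the sketch only illustrates which head dimensions each rule would admit; it says nothing about how the kernel itself would need to change.

```python
# Hypothetical eligibility rules for the two options in the suggestion above.
def is_multiple_of_32(head_dim: int) -> bool:
    """Stricter option: accept any positive multiple of 32."""
    return head_dim > 0 and head_dim % 32 == 0


def is_even(head_dim: int) -> bool:
    """Looser option: accept any positive even head dimension."""
    return head_dim > 0 and head_dim % 2 == 0


candidates = (64, 80, 96, 128, 192, 255)
multiple_of_32 = [d for d in candidates if is_multiple_of_32(d)]
even = [d for d in candidates if is_even(d)]

print("multiple of 32:", multiple_of_32)
print("even:", even)
```

Both rules admit 192. The even rule additionally admits sizes such as 80, which the multiple-of-32 rule rejects.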

awni added the enhancement label on Nov 23, 2024