Describe the bug
SDPA currently does not support head dimensions other than 64, 96, and 128.
To Reproduce
Fused attention falls back to the regular (unfused) implementation when the head dimension is not in (64, 96, 128):
https://github.com/ml-explore/mlx/blob/main/mlx/fast.cpp#L644-L645
Request
Support SDPA for uncommon head dimensions (still a multiple of 32).
Additional context
Awni's suggestion: "generalize it so that any even head dim or maybe multiple of 32 is supported"