metal-flash-attention support #22
Comments
Thank you for the feedback. I'm currently focusing on making it run faster, and I'll make time to take a look at this project and see if I can offer any assistance.
I came here to suggest the same thing.
Linking this here for reference: ggerganov/ggml#293
What needs to be done to make this happen? I'm not very good with C++, but I want to help.
#386
Could this project help you? https://github.com/philipturner/metal-flash-attention
So far, metal-flash-attention does provide the fastest generation speed for Stable Diffusion on macOS.