metal-flash-attention support #22

czkoko · 2023-08-21T15:24:15Z

Could this project be helpful to you? https://github.com/philipturner/metal-flash-attention

So far, metal-flash-attention provides the fastest generation speed for Stable Diffusion on macOS.

leejet · 2023-08-21T15:28:33Z

Thank you for the feedback. I'm currently focusing on making it run faster, and I'll make time to take a look at this project and see if I can offer any assistance.

sroussey · 2023-08-21T17:42:10Z

I came here to suggest the same thing.

Green-Sky · 2023-08-21T18:37:11Z

Linking this here for reference: ggerganov/ggml#293

GaidamakUA · 2023-10-03T06:13:46Z

What needs to be done to make this happen? I'm not very good with cpp, but I want to help.

Green-Sky · 2024-09-12T08:33:04Z

See #386.
I am in the dark on metal-flash-attention support, or Metal support in general, so it would be nice if someone with the hardware could test the PR. :)
