From ac77da5738c4414d31bd027749f1ad665791ca29 Mon Sep 17 00:00:00 2001
From: Kashif Rasul
Date: Mon, 2 Dec 2024 11:51:20 +0100
Subject: [PATCH] Update 2024-08-07-flexattention.md

---
 _posts/2024-08-07-flexattention.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/_posts/2024-08-07-flexattention.md b/_posts/2024-08-07-flexattention.md
index 4c34879d33b6..f8103cacec18 100644
--- a/_posts/2024-08-07-flexattention.md
+++ b/_posts/2024-08-07-flexattention.md
@@ -131,7 +131,7 @@ Alibi is similar to relative positional encodings with one exception \- it has a
 alibi_bias = generate_alibi_bias() # [num_heads]
 
 def alibi(score, b, h, q_idx, kv_idx):
-    bias = alibi_bias[h] * (q_idx - kv_idx)
+    bias = alibi_bias[h] * (kv_idx - q_idx)
     return score + bias
 ```
 
@@ -479,4 +479,4 @@ We want to highlight some prior work (and people) that have inspired FlexAttenti
 - The Jax team's work on SplashAttention
 - Philippe Tillet and Keren Zhou for helping us with Triton
 - Ali Hassani for discussions on neighborhood attention
-- Everybody who's complained about attention kernels not supporting their favorite attention variant :)
\ No newline at end of file
+- Everybody who's complained about attention kernels not supporting their favorite attention variant :)
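
Note for reviewers: the sign flip matters because under causal masking `kv_idx <= q_idx`, so `kv_idx - q_idx` is non-positive and the bias penalizes keys in proportion to their distance from the query, which is what ALiBi prescribes; the old `q_idx - kv_idx` rewarded distance instead. Below is a minimal runnable sketch of the corrected `score_mod`. It assumes an implementation of `generate_alibi_bias` using the standard ALiBi slope schedule (the helper is referenced but not defined in the patched post) and the `flex_attention` entry point from `torch.nn.attention.flex_attention`:

```py
import torch
from torch.nn.attention.flex_attention import flex_attention

def generate_alibi_bias(num_heads: int) -> torch.Tensor:
    # Assumed helper: per-head slopes from the ALiBi paper, the
    # geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^(-8) for n heads.
    return torch.tensor([2.0 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)])

alibi_bias = generate_alibi_bias(8)  # [num_heads]

def alibi(score, b, h, q_idx, kv_idx):
    # kv_idx - q_idx <= 0 for causal positions, so more distant keys
    # receive a larger penalty, scaled by this head's slope.
    bias = alibi_bias[h] * (kv_idx - q_idx)
    return score + bias

# Usage sketch: [batch, heads, seq_len, head_dim] inputs, eager mode.
q = k = v = torch.randn(1, 8, 128, 64)
out = flex_attention(q, k, v, score_mod=alibi)
print(out.shape)  # torch.Size([1, 8, 128, 64])
```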