diff --git a/_posts/2024-08-07-flexattention.md b/_posts/2024-08-07-flexattention.md
index 4c34879d33b6..f8103cacec18 100644
--- a/_posts/2024-08-07-flexattention.md
+++ b/_posts/2024-08-07-flexattention.md
@@ -131,7 +131,7 @@ Alibi is similar to relative positional encodings with one exception \- it has a
 alibi_bias = generate_alibi_bias() # [num_heads]
 
 def alibi(score, b, h, q_idx, kv_idx):
-    bias = alibi_bias[h] * (q_idx - kv_idx)
+    bias = alibi_bias[h] * (kv_idx - q_idx)
     return score + bias
 ```
 
@@ -479,4 +479,4 @@ We want to highlight some prior work (and people) that have inspired FlexAttenti
 - The Jax team's work on SplashAttention
 - Philippe Tillet and Keren Zhou for helping us with Triton
 - Ali Hassani for discussions on neighborhood attention
-- Everybody who's complained about attention kernels not supporting their favorite attention variant :)
\ No newline at end of file
+- Everybody who's complained about attention kernels not supporting their favorite attention variant :)
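
Review note on the first hunk: the sign flip matters because, under a causal mask, `kv_idx <= q_idx`, so `kv_idx - q_idx` is zero or negative and a positive per-head slope then penalizes distant keys instead of boosting them. The sketch below is plain Python, not the FlexAttention API. `generate_alibi_bias` here is a hypothetical stand-in (the post does not show its body) that assumes the geometric slopes from the ALiBi paper, and `alibi_old` / `alibi_new` are illustrative names for the two sides of the diff.

```python
def generate_alibi_bias(num_heads=8):
    # Hypothetical stand-in: assumes ALiBi-style geometric slopes,
    # 2^(-8 * (h + 1) / num_heads) for head h. Not taken from the post.
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


alibi_bias = generate_alibi_bias()  # [num_heads]


def alibi_old(score, b, h, q_idx, kv_idx):
    # Pre-change version: bias grows with distance into the past.
    bias = alibi_bias[h] * (q_idx - kv_idx)
    return score + bias


def alibi_new(score, b, h, q_idx, kv_idx):
    # Post-change version: bias shrinks (goes negative) with distance.
    bias = alibi_bias[h] * (kv_idx - q_idx)
    return score + bias


# Query at position 5 attending to a key 3 positions earlier, head 0.
old_score = alibi_old(0.0, 0, 0, q_idx=5, kv_idx=2)
new_score = alibi_new(0.0, 0, 0, q_idx=5, kv_idx=2)
# Same position: no bias under either version.
same_pos_score = alibi_new(0.0, 0, 0, q_idx=3, kv_idx=3)
```

With these assumed slopes, the old expression raises the score of the more distant key while the new one lowers it, which is the recency preference ALiBi is meant to encode.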