
implementation detail about alibi_mask #18

Open · bugm opened this issue Nov 29, 2023 · 0 comments
Labels: question (Further information is requested)

bugm commented Nov 29, 2023

Hello, I am reading the code that generates the alibi mask, linked here: https://github.com/ofirpress/attention_with_linear_biases/blob/master/fairseq/models/transformer.py

For the code at lines 760 and 761:

self.alibi = self.slopes.unsqueeze(1).unsqueeze(1) * torch.arange(maxpos).unsqueeze(0).unsqueeze(0).expand(attn_heads, -1, -1)  # line 760
self.alibi = self.alibi.view(attn_heads, 1, maxpos)  # line 761
I believe line 760 already produces a tensor of shape (attn_heads, 1, maxpos): self.slopes.unsqueeze(1).unsqueeze(1) is an (attn_heads, 1, 1) tensor, torch.arange(maxpos).unsqueeze(0).unsqueeze(0).expand(attn_heads, -1, -1) is an (attn_heads, 1, maxpos) tensor, and broadcasting their product yields shape (attn_heads, 1, maxpos).
So what is the purpose of viewing it as (attn_heads, 1, maxpos) again?
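
For reference, here is a minimal, self-contained sketch of the shape arithmetic. The attn_heads and maxpos values and the placeholder slopes are illustrative only, not taken from the repo (the actual code derives the slopes from a geometric sequence):

```python
import torch

attn_heads, maxpos = 8, 16  # illustrative values, not from the repo

# Placeholder slopes, one per head; any (attn_heads,) tensor
# is enough to demonstrate the broadcasting.
slopes = torch.tensor([2.0 ** (-(i + 1)) for i in range(attn_heads)])

# (attn_heads, 1, 1) * (attn_heads, 1, maxpos) broadcasts to (attn_heads, 1, maxpos)
alibi = slopes.unsqueeze(1).unsqueeze(1) * torch.arange(maxpos).unsqueeze(0).unsqueeze(0).expand(attn_heads, -1, -1)
print(alibi.shape)  # torch.Size([8, 1, 16])

# The follow-up .view requests the shape the tensor already has,
# so it changes nothing here:
assert alibi.view(attn_heads, 1, maxpos).shape == alibi.shape
```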

bugm added the question label on Nov 29, 2023