[Proposal] Documentation: Map the Act Names to the Transformer #644

Open
1 task done
JuVogt opened this issue Jun 21, 2024 · 3 comments
Labels
complexity-moderate Moderately complicated issues for people who have intermediate experience with the code documentation Improvements or additions to documentation

Comments

JuVogt commented Jun 21, 2024

Proposal

Create a figure that maps the act names to the transformer architecture.

Motivation

Act names are just conventions, so I find it hard to tell the exact position within the transformer block from the name alone. For example, resid_pre might be taken before the residual stream splits off into a sublayer or before the branches merge again. At the moment I work this out by comparing a name against the other act names and eliminating possibilities, or by modifying an activation and checking which values change.
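
The "modify it and see which values change" approach can be sketched on a toy block. This is only an illustration under stated assumptions: the sublayers are stand-in arithmetic rather than real attention or MLP code, `toy_block` and `changed_by_patch` are hypothetical helpers, and apart from resid_pre the hook names (attn_out, resid_mid, mlp_out, resid_post) and their order are assumed, not taken from this thread.

```python
# Toy residual block on a list of floats. Only the data flow matters;
# the "attention" and "MLP" steps are stand-in arithmetic (assumptions).

def toy_block(x, patch=None):
    """Run the toy block and record the activation at each named hook point.

    patch: optional (name, fn) pair; fn replaces the activation at that hook.
    """
    cache = {}

    def hook(name, value):
        if patch is not None and patch[0] == name:
            value = patch[1](value)
        cache[name] = list(value)
        return value

    # Assumed position: block input, before the stream branches into a sublayer.
    resid_pre = hook("resid_pre", x)
    # Stand-in for the attention sublayer output.
    attn_out = hook("attn_out", [0.5 * v for v in resid_pre])
    # Merge: residual stream plus the sublayer output.
    resid_mid = hook("resid_mid", [r + a for r, a in zip(resid_pre, attn_out)])
    # Stand-in for the MLP sublayer output.
    mlp_out = hook("mlp_out", [v * v for v in resid_mid])
    resid_post = hook("resid_post", [r + m for r, m in zip(resid_mid, mlp_out)])
    return cache


def changed_by_patch(x, name, fn):
    """Return the hook names whose recorded values differ after patching `name`."""
    clean = toy_block(x)
    patched = toy_block(x, patch=(name, fn))
    return [k for k in clean if clean[k] != patched[k]]


def zero(values):
    return [0.0 for _ in values]


for name in ["resid_pre", "attn_out", "resid_mid", "mlp_out", "resid_post"]:
    print(name, "->", changed_by_patch([1.0, 2.0], name, zero))
```

In this toy, zeroing attn_out changes attn_out, resid_mid, mlp_out, and resid_post but leaves resid_pre untouched, which places resid_pre upstream of the attention branch. A figure would give the same answer without running such an experiment.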

Pitch

I suggest using the figures from the Vaswani et al. paper ("Attention Is All You Need") and adding labeled arrows that point to the hook positions.

Alternatives

A list or table of (act name, description) pairs.
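
As a rough idea of what such a list could look like, here is a minimal sketch. The names other than resid_pre and all of the descriptions are my assumptions about a typical block layout, not verified against the code; `ACT_NAME_DESCRIPTIONS` and `format_table` are hypothetical names.

```python
# (act name, description) pairs. The entries are assumptions for illustration
# and would need to be checked against the actual hook positions.
ACT_NAME_DESCRIPTIONS = [
    ("resid_pre", "residual stream at the block input, before the attention branch"),
    ("attn_out", "output of the attention sublayer, before it is added back"),
    ("resid_mid", "residual stream after adding attn_out, before the MLP branch"),
    ("mlp_out", "output of the MLP sublayer, before it is added back"),
    ("resid_post", "residual stream at the block output, after adding mlp_out"),
]


def format_table(pairs):
    """Render the pairs as a plain-text table with an aligned name column."""
    width = max(len(name) for name, _ in pairs)
    return "\n".join(f"{name.ljust(width)}  {desc}" for name, desc in pairs)


print(format_table(ACT_NAME_DESCRIPTIONS))
```

A table like this is easier to keep in sync with the code than a figure, but it does not show the branching and merging as directly.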

Checklist

  • I have checked that there is no similar issue in the repo (required)
@bryce13950
Collaborator

@JuVogt Do you have time to handle this issue?

@bryce13950 bryce13950 added documentation Improvements or additions to documentation complexity-moderate Moderately complicated issues for people who have intermediate experience with the code labels Jun 26, 2024

tjbai commented Jul 23, 2024

I could put something together this week as a first PR for this project.

Author

JuVogt commented Jul 30, 2024

I am willing to contribute as well, but I am currently short on time, sorry about that. I can come back after I finish my thesis at the end of the year and design something, but a first sketch would definitely help. If someone has contributed a sketch by then, I could add a list of the act names with more information, e.g. the dimensions and the calculations behind each one, or vice versa.

I could also add more documentation with minimal examples alongside the Colabs, which I think would have helped me in the beginning.
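
To make the idea of listing dimensions per act name concrete, here is a small sketch. The shape layout (batch, position, then model or head dimensions) and every name other than resid_pre are assumptions on my side and should be checked against a real model before going into the docs; `expected_shapes` is a hypothetical helper.

```python
def expected_shapes(batch, pos, d_model, n_heads, d_head, d_mlp):
    """Return an assumed shape for each act name, given the model dimensions."""
    resid = (batch, pos, d_model)            # anything living in the residual stream
    per_head = (batch, pos, n_heads, d_head)  # per-head vectors
    return {
        "resid_pre": resid,
        "q": per_head,
        "k": per_head,
        "v": per_head,
        "pattern": (batch, n_heads, pos, pos),  # query position by key position
        "z": per_head,
        "attn_out": resid,
        "resid_mid": resid,
        "mlp_out": resid,
        "resid_post": resid,
    }


# Example with small, made-up dimensions.
for name, shape in expected_shapes(2, 5, 16, 4, 4, 64).items():
    print(f"{name:<10} {shape}")
```

The d_mlp argument is accepted but unused here, since I am not sure what the hidden MLP activations are called; a real list would add those entries.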
