Here's the code that does the aggregation. Nothing fancy, it just averages across layers and attention heads.
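For readers without the linked code at hand, here is a minimal sketch of what averaging across layers and heads could look like. It is not the project's actual implementation: the function name `aggregate_attention` and the nested-list layout `attn[layer][head][i][j]` are assumptions made for illustration, and it assumes every layer has the same number of heads.

```python
from statistics import fmean


def aggregate_attention(attn):
    """Average attention weights over all layers and heads.

    `attn` is assumed to be a nested list indexed as
    attn[layer][head][i][j], where i and j are token positions.
    Returns a 2D list averaged[i][j].
    """
    n_rows = len(attn[0][0])
    n_cols = len(attn[0][0][0])
    return [
        [
            # Mean over every (layer, head) pair for this token pair.
            fmean(head[i][j] for layer in attn for head in layer)
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


# Toy input: 2 layers, 2 heads per layer, 2x2 token positions.
toy_attention = [
    [  # layer 0
        [[1.0, 0.0], [0.5, 0.5]],  # head 0
        [[1.0, 0.0], [0.5, 0.5]],  # head 1
    ],
    [  # layer 1
        [[0.5, 0.5], [0.0, 1.0]],  # head 0
        [[0.5, 0.5], [1.0, 0.0]],  # head 1
    ],
]

averaged = aggregate_attention(toy_attention)
print(averaged)
```

With real model outputs you would more likely do the same thing as a mean over the layer and head axes of a tensor; the nested lists here just keep the sketch dependency-free.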
If you're curious about more advanced aggregation, I started exploring learned aggregators a few months ago in MIA. If you can produce training data that links input text to output text in some meaningful way, you can train a classifier on top of the attention values to predict that relationship.
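To make the idea of a learned aggregator concrete, here is a rough sketch under stated assumptions. It is not the approach used in MIA, whose details are not given here. It assumes each training example is a vector of attention values for one input/output token pair (one value per head), labeled 1 if the pair is meaningfully linked and 0 otherwise, and it swaps in a plain logistic regression as the classifier. The names `train_logistic` and `predict`, and the toy data, are hypothetical.

```python
import math


def predict(weights, bias, x):
    """Probability that the token pair described by x is linked."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression with per-example gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            err = predict(weights, bias, x) - y  # gradient of log loss w.r.t. z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Hypothetical toy data: attention from 3 heads for four token pairs.
features = [
    [0.9, 0.1, 0.8],
    [0.8, 0.2, 0.7],
    [0.1, 0.9, 0.2],
    [0.2, 0.8, 0.1],
]
labels = [1, 1, 0, 0]  # 1 = input and output token are linked

weights, bias = train_logistic(features, labels)
print(weights, bias)
print(predict(weights, bias, [0.85, 0.15, 0.75]))
```

The learned weights play the role that the uniform average plays in the simple case: heads that are informative for the relationship get larger weights, and uninformative ones are down-weighted.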
Is there any documentation for this project? How are the attention weights aggregated?