Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reusing the previous layer's KV cache, etc. #1299
Merged
ShashankMosaicML merged 86 commits into mosaicml:main from ShashankMosaicML:mixed_attention_modules on Jun 30, 2024
+849 −25
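The PR title describes the general pattern: rather than using one attention variant for every transformer block, each layer gets its own configuration, so full attention, sliding-window attention, and layers that reuse an earlier layer's key/value tensors can be interleaved in one model. Below is a minimal, illustrative PyTorch sketch of that pattern; the `ToyBlock` class, the per-layer config dicts, and the keys `sliding_window_size` and `reuse_kv_layer_idx` are assumptions made for this sketch, not the PR's actual implementation in llm-foundry.

```python
# Hypothetical sketch: interleaving attention variants per layer.
# Not the PR's actual API; config schema is assumed for illustration.
import math
import torch
import torch.nn.functional as F

def attention(q, k, v, window=None):
    # q, k, v: (batch, seq, dim). Causal mask, optionally banded to `window`.
    seq = q.size(1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    i = torch.arange(seq).unsqueeze(1)
    j = torch.arange(seq).unsqueeze(0)
    mask = j > i                      # causal: no attending to the future
    if window is not None:
        mask |= (i - j) >= window     # sliding window: only the last `window` keys
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

class ToyBlock(torch.nn.Module):
    def __init__(self, dim, cfg):
        super().__init__()
        self.cfg = cfg                # per-layer config dict (assumed schema)
        self.q = torch.nn.Linear(dim, dim)
        self.k = torch.nn.Linear(dim, dim)
        self.v = torch.nn.Linear(dim, dim)

    def forward(self, x, kv_cache):
        q = self.q(x)
        reuse = self.cfg.get("reuse_kv_layer_idx")
        if reuse is not None:
            k, v = kv_cache[reuse]    # reuse an earlier layer's K/V tensors
        else:
            k, v = self.k(x), self.v(x)
        out = x + attention(q, k, v, window=self.cfg.get("sliding_window_size"))
        return out, (k, v)

# Interleaved pattern: full attention, sliding window, then KV reuse of layer 0.
layer_cfgs = [{}, {"sliding_window_size": 4}, {"reuse_kv_layer_idx": 0}]
dim = 16
blocks = [ToyBlock(dim, c) for c in layer_cfgs]
x = torch.randn(2, 8, dim)
kv_cache = []
for block in blocks:
    x, kv = block(x, kv_cache)
    kv_cache.append(kv)
print(x.shape)  # torch.Size([2, 8, 16])
```

One design note: layers that reuse a previous layer's KV cache skip their own key/value projections, which trades some expressivity for reduced compute and cache memory, while sliding-window layers bound the attention span per token.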