Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reuse prev layer kv cache etc. #7019
Triggered via pull request
June 25, 2024 00:13
ShashankMosaicML
synchronize
#1299
Status
Success
Total duration
16m 53s
Artifacts
–
Annotations
2 warnings
gpu-2.3.0 / pytest-gpu
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-python@v4, actions/cache@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
gpu-2.3.1 / pytest-gpu
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-python@v4, actions/cache@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.