Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reuse prev layer kv cache etc. #7028

Triggered via pull request June 25, 2024 19:44
@ShashankMosaicML
synchronize #1299
Status Success
Total duration 8m 47s

pr-gpu.yaml

on: pull_request_target
Matrix: pytest-gpu

Annotations

2 warnings
gpu-2.3.1 / pytest-gpu
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-python@v4, actions/cache@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
gpu-2.3.0 / pytest-gpu
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-python@v4, actions/cache@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
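
The warning names three actions pinned to releases that run on Node.js 16. A minimal sketch of the corresponding version bump, assuming the workflow (or a reusable workflow it calls) invokes them as ordinary `uses:` steps. Only the action names and their old versions come from the annotation; the step layout, Python version, cache path, and cache key below are hypothetical placeholders.

```yaml
steps:
  # was actions/checkout@v3 (Node.js 16); v4 runs on Node.js 20
  - uses: actions/checkout@v4

  # was actions/setup-python@v4 (Node.js 16); v5 runs on Node.js 20
  - uses: actions/setup-python@v5
    with:
      python-version: "3.11"   # hypothetical; keep whatever the workflow already sets

  # was actions/cache@v3 (Node.js 16); v4 runs on Node.js 20
  - uses: actions/cache@v4
    with:
      path: ~/.cache/pip       # hypothetical cache path
      key: ${{ runner.os }}-pip-${{ hashFiles('setup.py') }}   # hypothetical cache key
```

Since both matrix jobs (gpu-2.3.0 and gpu-2.3.1) report the same annotation, the pins most likely live in one shared place, so a single change should clear both warnings.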