Skip to content

Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reuse prev layer kv cache etc. #7019

Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reuse prev layer kv cache etc.

Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reuse prev layer kv cache etc. #7019

Triggered via pull request June 25, 2024 00:13
@ShashankMosaicMLShashankMosaicML
synchronize #1299
Status Success
Total duration 16m 53s
Artifacts

pr-gpu.yaml

on: pull_request_target
Matrix: pytest-gpu
Fit to window
Zoom out
Zoom in

Annotations

2 warnings
gpu-2.3.0 / pytest-gpu
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-python@v4, actions/cache@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
gpu-2.3.1 / pytest-gpu
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-python@v4, actions/cache@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.