Skip to content

Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reuse prev layer kv cache etc. #7029

Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reuse prev layer kv cache etc.

Allows interweaving of arbitrary kinds of 'attention' layers, like sliding window, reuse prev layer kv cache etc. #7029

Triggered via pull request June 25, 2024 23:18
@ShashankMosaicMLShashankMosaicML
synchronize #1299
Status Success
Total duration 9m 14s
Artifacts

pr-gpu.yaml

on: pull_request_target
Matrix: pytest-gpu
Fit to window
Zoom out
Zoom in

Annotations

2 warnings
gpu-2.3.1 / pytest-gpu
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-python@v4, actions/cache@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
gpu-2.3.0 / pytest-gpu
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-python@v4, actions/cache@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.