Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[
Mamba2
] Fix caching, slow path, and multi-gpu (#35154)
* fixup mamba2 - caching and several other small fixes * fixup cached forward * correct fix this time * fixup cache - we do not need to extend the attn mask it's handled by generate (gives total ids + mask at each step) * remove unnecessary (un)squeeze * fixup cache position * simplify a few things * [run-slow] mamba2 * multi gpu attempt two * [run-slow] mamba2 * [run-slow] mamba2 * [run-slow] mamba2 * [run-slow] mamba2 * add newer slow path fix * [run-slow] mamba2
- Loading branch information