b3092

github-actions released this 05 Jun 17:04
7d1a378
CUDA: refactor mmq, dmmv, mmvq (#7716)

* CUDA: refactor mmq, dmmv, mmvq

* fix out-of-bounds write

* struct for qk, qr, qi

* fix cmake build

* mmq_type_traits