v0.1.3

@mobicham mobicham released this 12 Feb 16:58
96ce17d

HQQ v0.1.3

New features

  • Added CUDA kernels for dequantization (up to 2-3x inference speed-up vs. PyTorch)
  • Added support for the compute_dtype parameter (useful for float32/bfloat16 LoRA training)
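
To make the two items above concrete, here is a minimal pure-Python sketch of what dequantizing into a chosen compute dtype means. It does not use the HQQ API: the helper names (`round_to_dtype`, `quantize`, `dequantize`) are hypothetical, the quantizer is plain min/max affine quantization rather than HQQ's optimizer, and dtypes are emulated on Python floats with the standard `struct` module. The only part taken to mirror the library is the affine reconstruction `(W_q - zero) * scale`.

```python
import struct


def round_to_dtype(x, dtype):
    """Round a Python float to the given precision (toy stand-in for a tensor dtype)."""
    if dtype == "float32":
        return struct.unpack("<f", struct.pack("<f", x))[0]
    if dtype == "float16":
        return struct.unpack("<e", struct.pack("<e", x))[0]
    if dtype == "bfloat16":
        # bfloat16 keeps the top 16 bits of a float32.
        # Assumption: simple truncation here, not round-to-nearest.
        bits = struct.unpack("<I", struct.pack("<f", x))[0] & 0xFFFF0000
        return struct.unpack("<f", struct.pack("<I", bits))[0]
    raise ValueError(f"unsupported dtype: {dtype}")


def quantize(weights, nbits=4):
    """Plain min/max affine quantization of one group of weights (not HQQ's solver)."""
    w_min, w_max = min(weights), max(weights)
    levels = 2 ** nbits - 1
    scale = (w_max - w_min) / levels
    if scale == 0.0:
        scale = 1.0  # constant group: avoid division by zero
    zero = -w_min / scale
    codes = [min(levels, max(0, round(w / scale + zero))) for w in weights]
    return codes, scale, zero


def dequantize(codes, scale, zero, compute_dtype="float16"):
    """Reconstruct weights as (W_q - zero) * scale, rounded to compute_dtype."""
    return [round_to_dtype((q - zero) * scale, compute_dtype) for q in codes]


weights = [-1.0, -0.5, 0.0, 0.5, 2.0]
codes, scale, zero = quantize(weights, nbits=4)
for dtype in ("float16", "bfloat16", "float32"):
    print(dtype, dequantize(codes, scale, zero, compute_dtype=dtype))
```

The stored codes are the same in every case; only the precision of the reconstructed weights changes. That is why a configurable compute dtype matters when the dequantized weights feed a float32 or bfloat16 LoRA training path.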