v0.4.1: Gemma 2 Support, CrossEntropy Patching Fix, and GroupNorm

@ByronHsu released this 12 Nov 23:42
· 26 commits to main since this release
d784664

Highlights

  1. Gemma 2 Support: The long-pending Gemma 2 support has finally landed, thanks to @Tcc0403! He implemented the nasty softcapping in fused linear cross entropy (#320) and discovered a convergence issue, which was later fixed by @ByronHsu and @Tcc0403 together (#376).

  2. CrossEntropy Patching Fix: If you monkey patch CrossEntropy (not FLCE), the patch does not actually take effect after transformers 4.46.1, because the model code replaced CrossEntropy with F.cross_entropy. We fixed the issue in #375.

  3. GroupNorm Kernel: Our new contributor @pramodith implemented a GroupNorm kernel (#375) with a 2x speedup.
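For readers unfamiliar with the softcapping mentioned in the Gemma 2 item, here is a minimal pure-Python sketch of the idea: logits are squashed into (-cap, cap) with `cap * tanh(x / cap)` before the cross entropy is computed. This is not the fused kernel from #320; the function names and the cap value of 30.0 are illustrative assumptions.

```python
import math


def softcap(logits, cap):
    """Squash each logit smoothly into (-cap, cap) via cap * tanh(x / cap)."""
    return [cap * math.tanh(x / cap) for x in logits]


def cross_entropy(logits, target):
    """Numerically stable cross entropy for a single example."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


# Illustrative values only; the cap of 30.0 is an assumption for this sketch.
raw_logits = [100.0, -50.0, 3.0]
capped = softcap(raw_logits, cap=30.0)
loss = cross_entropy(capped, target=2)
```

Small logits pass through almost unchanged, while large ones saturate near the cap. That keeps the loss bounded, but it also means the backward pass has to account for the derivative of `tanh`.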
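The CrossEntropy patching problem can be illustrated with a small simulation of why a monkey patch silently stops applying. Everything here is a hypothetical stand-in (the `framework` namespace, `FastLoss`, `new_model_loss`, and so on), not the real torch, transformers, or Liger Kernel API, and the actual fix in #375 may be structured differently.

```python
from types import SimpleNamespace


def original_cross_entropy(logits, target):
    return "original"


def fast_cross_entropy(logits, target):
    return "patched"


class OriginalLoss:
    """Stand-in for a loss class that wraps the functional form."""

    def __call__(self, logits, target):
        return original_cross_entropy(logits, target)


class FastLoss:
    """Stand-in for the optimized replacement class."""

    def __call__(self, logits, target):
        return fast_cross_entropy(logits, target)


# Hypothetical framework namespace that model code looks names up in.
framework = SimpleNamespace(
    CrossEntropyLoss=OriginalLoss,
    cross_entropy=original_cross_entropy,
)


def old_model_loss(logits, target):
    # Older model code: instantiates the loss class.
    return framework.CrossEntropyLoss()(logits, target)


def new_model_loss(logits, target):
    # Newer model code: calls the functional form directly.
    return framework.cross_entropy(logits, target)


# Patching only the class reaches the old code path but misses the new one.
framework.CrossEntropyLoss = FastLoss
old_result = old_model_loss([0.0], 0)
new_result_before_fix = new_model_loss([0.0], 0)

# Patching the function as well covers the new code path.
framework.cross_entropy = fast_cross_entropy
new_result_after_fix = new_model_loss([0.0], 0)
```

The takeaway is that a monkey patch only works if it targets the exact name the model code looks up at call time, so a refactor in the upstream library can disable it without raising any error.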

What's Changed

New Contributors

Full Changelog: v0.4.0...v0.4.1