
Releases: mobiusml/hqq

v0.1.5

01 Mar 10:50

HQQ v0.1.5

New features

  • Added support for multi-GPU FSDP QLoRA training (#17); a usage sketch follows the issues list below.

Issues

  • torch.compile and the PYTORCH_COMPILE backend break with view_as_float=True. No known workaround at the moment.
  • Slightly slower inference with view_as_float=True. Solution: after training, revert to integer bitpacking.
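
A minimal sketch of the training-time setup, assuming the BaseQuantizeConfig/HQQLinear API from the project README; the view_as_float keyword and exact argument names are inferred from these notes:

```python
import torch
from hqq.core.quantize import BaseQuantizeConfig, HQQLinear

# Assumed: view_as_float=True exposes the bit-packed buffer viewed as floats,
# which FSDP needs because it cannot shard integer parameters.
quant_config = BaseQuantizeConfig(nbits=4, group_size=64, view_as_float=True)

layer = torch.nn.Linear(4096, 4096, bias=False)
hqq_layer = HQQLinear(layer, quant_config=quant_config, compute_dtype=torch.bfloat16)

# After training, re-quantize with view_as_float=False to restore integer
# bitpacking and recover full inference speed (see the issues above).
```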

v0.1.4

28 Feb 09:55

HQQ v0.1.4

New features

  • Added 1-bit support with CUDA dequant kernels.
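
For illustration, a hedged sketch of requesting 1-bit weights; nbits=1 follows the BaseQuantizeConfig pattern from the README, and the small group_size is an assumption (extreme low-bit settings usually need finer groups):

```python
from hqq.core.quantize import BaseQuantizeConfig

# Assumed values: 1-bit weights with small quantization groups for accuracy.
# Dequantization then runs through the new CUDA kernels where available.
quant_config = BaseQuantizeConfig(nbits=1, group_size=32)
```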

v0.1.3.post1

20 Feb 16:41

HQQ v0.1.3.post1

New features

  • meta_offloading support: offloads quantization meta-data (scales/zero-points) to the CPU, achieving true n-bit storage on the GPU; see the sketch below.
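
A sketch of enabling the feature; offload_meta as a BaseQuantizeConfig keyword is an assumption based on this note:

```python
from hqq.core.quantize import BaseQuantizeConfig

# Assumed: offload_meta=True keeps scales/zero-points in CPU memory, so only
# the n-bit packed weights reside on the GPU.
quant_config = BaseQuantizeConfig(nbits=4, group_size=64, offload_meta=True)
```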

v0.1.3

12 Feb 16:58

HQQ v0.1.3

New features

  • Added CUDA kernels for dequantization (up to 2-3x inference speed-up vs. PyTorch)
  • Added support for a compute_dtype parameter (useful for float32/bfloat16 LoRA training); see the sketch below
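
A minimal sketch of the compute_dtype parameter, assuming the HQQLinear wrapper signature from the README (argument names may differ slightly):

```python
import torch
from hqq.core.quantize import BaseQuantizeConfig, HQQLinear

quant_config = BaseQuantizeConfig(nbits=4, group_size=64)

# bfloat16 compute is handy for LoRA training; float16 is the usual inference choice.
layer = torch.nn.Linear(4096, 4096, bias=False)
hqq_layer = HQQLinear(layer, quant_config=quant_config, compute_dtype=torch.bfloat16)
```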

v0.1.2.post1

18 Jan 11:21

HQQ v0.1.2.post1

Bug fixes

  • Fixed LoRA adapter loading.

v0.1.2

08 Jan 17:20

HQQ v0.1.2

Improvements

  • Added LoRA support
  • Added LoRA with fake quantization support (experimental)
  • Optimizer V2 with scale update support
  • Some code refactoring in quantize.py
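
A heavily hedged sketch of the LoRA support; PeftUtils.add_lora and the per-layer parameter layout below are hypothetical, modeled on the project's examples rather than a documented API:

```python
import torch
from hqq.engine.hf import HQQModelForCausalLM
from hqq.core.quantize import BaseQuantizeConfig
from hqq.core.peft import PeftUtils  # assumed module path

model = HQQModelForCausalLM.from_pretrained('meta-llama/Llama-2-7b-hf')
model.quantize_model(quant_config=BaseQuantizeConfig(nbits=4, group_size=64))

# Hypothetical adapter settings and layer mapping.
lora_params = {'r': 8, 'lora_alpha': 8, 'dropout': 0.05, 'train_dtype': torch.float32}
PeftUtils.add_lora(model, {'self_attn.q_proj': lora_params,
                           'self_attn.v_proj': lora_params})
```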

v0.1.1.post1

03 Jan 21:52

HQQ v0.1.1.post1

No improvements over v0.1.1. Just removed PyTorch from the dependencies and updated the README.

v0.1.1

18 Dec 13:16

HQQ v0.1.1

Improvements

  • Added Mixtral support for the Hugging Face backend.
  • Added support for layer-wise custom quantization configs (sketched below).
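
A sketch of a layer-wise config, assuming per-layer settings are passed as a dict keyed by linear-layer tags (the tag names below are illustrative Llama/Mixtral module names):

```python
from hqq.core.quantize import BaseQuantizeConfig

q4 = BaseQuantizeConfig(nbits=4, group_size=64)
q2 = BaseQuantizeConfig(nbits=2, group_size=16)

# Assumed layout: keep attention at 4-bit, quantize the MLP more aggressively.
quant_config = {
    'self_attn.q_proj': q4, 'self_attn.k_proj': q4,
    'self_attn.v_proj': q4, 'self_attn.o_proj': q4,
    'mlp.gate_proj': q2, 'mlp.up_proj': q2, 'mlp.down_proj': q2,
}
```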

v0.1.0

05 Dec 15:11

HQQ v0.1.0

Improvements

  • Added compile backend support
  • Added Aten C++ backend (experimental)
  • Faster bit unpacking via pre-allocated empty tensors
  • Added vLLM support
  • Refactored to call quantize_model() on model instances; see the sketch below
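
A sketch combining the instance-level quantize_model() call with backend selection; the HQQBackend enum values are assumptions, except PYTORCH_COMPILE, which the v0.1.5 notes mention by name:

```python
from hqq.engine.hf import HQQModelForCausalLM
from hqq.core.quantize import BaseQuantizeConfig, HQQBackend, HQQLinear

# quantize_model() is now called on the model instance rather than as a class method.
model = HQQModelForCausalLM.from_pretrained('meta-llama/Llama-2-7b-hf')
model.quantize_model(quant_config=BaseQuantizeConfig(nbits=4, group_size=64))

# Select the dequantization backend; ATEN is the experimental C++ path.
HQQLinear.set_backend(HQQBackend.PYTORCH_COMPILE)
```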

Supported models

  • Llama (Hugging Face + vLLM)
  • ViT-CLIP (timm)

Limitations

  • HF only supports single-GPU runtime.
  • vLLM only supports a single GPU with a single worker.
  • The compile backend sometimes creates issues with async runtime.
  • Doesn't support PEFT (LoRA, etc.).

0.1.0-alpha

01 Dec 16:35

HQQ 0.1.0-alpha

Alpha version with basic Hugging Face/timm support.

Supported models

  • Llama (Hugging Face)
  • ViT (timm)
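
For the timm path, a sketch using the later HQQtimm engine API for illustration; the entry points at alpha time may have differed, and the model id below is just an example:

```python
from hqq.engine.timm import HQQtimm  # assumed wrapper module
from hqq.core.quantize import BaseQuantizeConfig

model = HQQtimm.create_model('vit_large_patch14_clip_224.laion2b', pretrained=True)
model.quantize_model(quant_config=BaseQuantizeConfig(nbits=4, group_size=64))
```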

Limitations

  • Uses a pure PyTorch implementation without optimizations.
  • Only supports single-GPU runtime.
  • Doesn't support PEFT (LoRA, etc.) for custom training.