
Add autoawq to requirements #6533

Open · wants to merge 4 commits into base: dev

Conversation

casper-hansen
Contributor

@oobabooga I have re-added autoawq to the requirements because I recently overhauled autoawq's own requirements. They are now extremely minimal, since we only rely on Triton kernels. We also provide autoawq[kernels] for faster inference, but I have chosen to leave it out here because torch has no backwards-compatible support between minor versions.
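For clarity, the two install paths would look roughly like this (a minimal sketch; only the `[kernels]` extra pulls in the prebuilt CUDA kernels):

```sh
# Default install: minimal dependencies, Triton kernels only
pip install autoawq

# Optional extra: prebuilt CUDA kernels for faster inference
# (ties the install to a specific torch/CUDA combination)
pip install "autoawq[kernels]"
```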


casper-hansen changed the base branch from main to dev on November 18, 2024 at 15:15
@oobabooga
Owner

@casper-hansen thanks for the PR. I removed AutoAWQ from the requirements because, according to the AutoAWQ Kernels README, the provided wheels are for CUDA 12.4.1, while the project currently uses CUDA 12.1:

1a870b3

Is CUDA 12.4 really necessary? Wheels are usually not backwards compatible across CUDA versions, and it becomes hard to reconcile ExLlamaV2 wheels with AutoAWQ wheels in the same environment.

@casper-hansen
Contributor Author

casper-hansen commented Nov 18, 2024

@oobabooga I may not have communicated this perfectly, but the AutoAWQ kernels are now optional and do not get installed unless you specify pip install autoawq[kernels]. They are only needed if you want faster inference, so I would prefer for superusers to do that themselves or compile the kernels if needed.

EDIT: To answer your question more directly: yes. If you want to use the latest compiled kernels, here is the version list against torch and CUDA (a pinning sketch follows the list). However, I am looking into options for distributing these kernels in a way that is independent of the CUDA and torch versions.

  • 0.0.7: torch 2.3.x with cuda 12.1
  • 0.0.8: torch 2.4.x with cuda 12.4
  • 0.0.9: torch 2.5.x with cuda 12.4
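As a rough sketch of what pinning against this project's current CUDA 12.1 environment could look like (illustrative pins only, derived from the list above; the exact torch pin the project uses may differ):

```text
# requirements sketch for a CUDA 12.1 / torch 2.3.x environment
torch==2.3.*
autoawq
autoawq-kernels==0.0.7
```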

@oobabooga
Owner

The thing about triton is that it doesn't work on Windows, and I guess most people who install this project are on Windows (unfortunately). So to add AutoAWQ back, I'd need either autoawq-kernels wheels that work with CUDA 12.1, or to update the whole project to CUDA 12.4.
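One possible middle ground, shown here only as a sketch (this is not something the project does today), would be a standard pip environment marker so the dependency is skipped on Windows, where Triton is unavailable:

```text
# Illustrative requirements.txt line: install autoawq everywhere except Windows
autoawq; platform_system != "Windows"
```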

@casper-hansen
Contributor Author

It seems there is a working Triton installation for Windows, linked below. This would be preferable, since Windows is not compatible with most CUDA kernels either way.
https://github.com/woct0rdho/triton-windows/tree/v3.1.x-windows
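A quick sanity check for any Triton build, on Windows or otherwise (nothing specific to that fork is assumed here):

```python
# Confirm that Triton imports and report which version is installed
import triton
print(triton.__version__)
```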

@oobabooga
Owner

Having additional installation steps defeats the purpose of adding the library to requirements.txt. I would need wheels that work with the current environment to add the requirement back. Maybe I should do that in a fork, assuming the library does in fact work with CUDA 12.1.
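For reference, checking which torch/CUDA combination an existing environment actually has is straightforward (a standard torch check, nothing project-specific):

```sh
# Print the installed torch version and the CUDA version it was built against
python -c "import torch; print(torch.__version__, torch.version.cuda)"
```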
