Add autoawq to requirements #6533
base: dev
Conversation
Merge dev branch
@casper-hansen thanks for the PR. I removed AutoAWQ from the requirements because, according to the AutoAWQ Kernels README, the provided wheels are built for CUDA 12.4.1, while the project currently uses CUDA 12.1. Is CUDA 12.4 really necessary? Wheels are usually not backwards compatible across CUDA versions, and it becomes hard to reconcile ExLlamaV2 wheels with AutoAWQ wheels in the same environment.
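(For context, a quick way to confirm which CUDA build the installed torch wheel targets; a minimal sketch, assuming torch is already installed in the project's environment:)

```
# Prints the CUDA version torch was compiled against, e.g. 12.1
python -c "import torch; print(torch.version.cuda)"
```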
@oobabooga I may not have communicated this perfectly - but the AutoAWQ kernels are now optional and do not get installed unless you specify the `autoawq[kernels]` extra. EDIT: To answer your question more directly: yes. If you want to use the latest compiled kernels, here is the version list for torch and CUDA. However, I am looking into options for distributing these kernels in a way that is independent of the CUDA and Torch versions.
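For illustration, a minimal sketch of the two install paths being described (the `kernels` extra name comes from the later comment; the default path only pulls the Triton-based package):

```
pip install autoawq            # default: Triton kernels only, no compiled CUDA wheels
pip install autoawq[kernels]   # optional: prebuilt kernels pinned to specific torch/CUDA builds
```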
The thing about triton is that it doesn't work on Windows, and I guess most people who install this project are on Windows (unfortunately). So to add AutoAWQ back, I'd need either
It seems there is a working triton installation for Windows below. This would be preferred, since Windows is not compatible with most CUDA kernels either way.
Having additional installation steps defeats the purpose of adding the library to the requirements.
@oobabooga I have re-added autoawq to the requirements because I recently overhauled the requirements for autoawq. They are now extremely minimal, since we only rely on Triton kernels. We also have `autoawq[kernels]` for faster inference, but I have chosen to leave it out here because torch has no backwards-compatible support between minor versions.
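As a rough sketch of what the resulting requirements entry would look like (the exact version pin used by the PR is not shown here, so it is omitted):

```
# requirements.txt excerpt (sketch): plain autoawq, Triton-only;
# the optional [kernels] extra is intentionally left out
autoawq
```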