
Add autoawq to requirements #6533

Open · wants to merge 4 commits into base: dev

Conversation

casper-hansen
Contributor

@oobabooga I have re-added autoawq to the requirements because I recently overhauled autoawq's own requirements. They are now extremely minimal, since we only rely on Triton kernels. We also provide autoawq[kernels] for faster inference, but I have chosen to leave it out here because torch has no backwards-compatible support between minor versions.
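For clarity, the two install paths would look roughly like this (a minimal sketch; only the `[kernels]` extra pulls in the prebuilt CUDA kernels):

```sh
# Default install: minimal dependencies, Triton kernels only
pip install autoawq

# Optional extra: prebuilt CUDA kernels for faster inference
# (ties the install to a specific torch/CUDA combination)
pip install "autoawq[kernels]"
```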


casper-hansen changed the base branch from main to dev on November 18, 2024 at 15:15
@oobabooga
Owner

@casper-hansen thanks for the PR. I removed AutoAWQ from the requirements because, according to the AutoAWQ Kernels README, the provided wheels are for CUDA 12.4.1, while the project currently uses CUDA 12.1:

1a870b3

Is CUDA 12.4 really necessary? Wheels are usually not backwards compatible across CUDA versions, and it becomes hard to reconcile ExLlamaV2 wheels with AutoAWQ wheels in the same environment.

@casper-hansen
Contributor Author

casper-hansen commented Nov 18, 2024

@oobabooga I may not have communicated this perfectly, but the AutoAWQ kernels are now optional and do not get installed unless you specify pip install autoawq[kernels]. They are only needed if you want faster inference, so I would prefer for superusers to do that themselves or compile the kernels if needed.

EDIT: To answer your question more directly: yes. If you want to use the latest compiled kernels, here is the version list against torch and CUDA (a pinning sketch follows the list). However, I am looking into options for distributing these kernels in a way that is independent of the CUDA and torch versions.

  • 0.0.7: torch 2.3.x with cuda 12.1
  • 0.0.8: torch 2.4.x with cuda 12.4
  • 0.0.9: torch 2.5.x with cuda 12.4
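As a rough sketch of what pinning against this project's current CUDA 12.1 environment could look like (illustrative pins only, derived from the list above; the exact torch pin the project uses may differ):

```text
# requirements sketch for a CUDA 12.1 / torch 2.3.x environment
torch==2.3.*
autoawq
autoawq-kernels==0.0.7
```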

@oobabooga
Owner

The thing about triton is that it doesn't work on Windows, and I guess most people who install this project are on Windows (unfortunately). So to add AutoAWQ back, I'd need either autoawq-kernels wheels that work with CUDA 12.1, or to update the whole project to CUDA 12.4.
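One possible middle ground, shown here only as a sketch (this is not something the project does today), would be a standard pip environment marker so the dependency is skipped on Windows, where Triton is unavailable:

```text
# Illustrative requirements.txt line: install autoawq everywhere except Windows
autoawq; platform_system != "Windows"
```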

@casper-hansen
Contributor Author

It seems there is a working Triton installation for Windows, linked below. This would be preferable, since Windows is not compatible with most CUDA kernels either way.
https://github.com/woct0rdho/triton-windows/tree/v3.1.x-windows
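A quick sanity check for any Triton build, on Windows or otherwise (nothing specific to that fork is assumed here):

```python
# Confirm that Triton imports and report which version is installed
import triton
print(triton.__version__)
```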

@oobabooga
Owner

Having additional installation steps defeats the purpose of adding the library to requirements.txt. I would need wheels that work with the current environment to add the requirement back. Maybe I should do that in a fork, assuming the library does in fact work with CUDA 12.1.
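For reference, checking which torch/CUDA combination an existing environment actually has is straightforward (a standard torch check, nothing project-specific):

```sh
# Print the installed torch version and the CUDA version it was built against
python -c "import torch; print(torch.__version__, torch.version.cuda)"
```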
