
Gptq tokenized dataset #1584

Merged: 8 commits into huggingface:main on Dec 13, 2023
Conversation

SunMarc (Member) commented Dec 11, 2023

What does this PR do?

This PR allows passing an already-tokenized dataset for GPTQ quantization.
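
A minimal sketch of how this might be used (the model id, the example texts, and the exact format accepted for pre-tokenized examples are assumptions, not something spelled out in this PR):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.gptq import GPTQQuantizer

model_id = "facebook/opt-125m"  # placeholder model, not from this PR
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Tokenize the calibration texts yourself ...
texts = ["example calibration sentence one", "another calibration sentence"]
tokenized_dataset = [tokenizer(t, return_tensors="pt") for t in texts]

# ... and hand the pre-tokenized examples to the quantizer instead of raw strings.
quantizer = GPTQQuantizer(bits=4, dataset=tokenized_dataset)
quantized_model = quantizer.quantize_model(model, tokenizer)
```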

Review comments on optimum/gptq/quantizer.py (outdated, resolved)
fxmarty (Contributor) commented Dec 12, 2023

#1585 supersedes this, right?

SunMarc (Member, Author) commented Dec 12, 2023

Thanks for having a look @fxmarty! No, this is functionality requested by @TheBloke. Quantization in transformers is quite slow compared to AutoGPTQ, possibly because of the dataset processing, so we now allow passing a tokenized dataset. My hunch, though, is that with modules_in_block_to_quantize we should now get the same speed, because when modules_in_block_to_quantize is not set we quantize one layer at a time.
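
For reference, a sketch of what that grouping could look like, assuming GPTQQuantizer exposes a modules_in_block_to_quantize argument that takes a list of lists of module names (the Llama-style names below are illustrative only):

```python
from optimum.gptq import GPTQQuantizer

# Each inner list is handled together rather than one layer at a time.
quantizer = GPTQQuantizer(
    bits=4,
    dataset="c4",  # built-in calibration dataset name
    modules_in_block_to_quantize=[
        ["self_attn.k_proj", "self_attn.v_proj", "self_attn.q_proj"],
        ["self_attn.o_proj"],
        ["mlp.gate_proj", "mlp.up_proj"],
        ["mlp.down_proj"],
    ],
)
```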

TheBloke commented
Yeah I asked for it so I could have complete control over the dataset when using Transformers to make a GPTQ.

With AutoGPTQ, I have this control, because I can tokenise the dataset myself and then pass this to AutoGPTQ to use.

I use this to pick context-length-appropriate samples. E.g. for a 4096-context model, I will pass 128 x 4096-token samples.

With Transformers I could never do this: I just had to pass a List[str], and I wasn't sure exactly what data was being used, so I just passed 5000 strings of various lengths.

Transformers was also much slower at making GPTQs than AutoGPTQ, and I thought these facts might be connected - although based on what Marc said here, maybe that's for other reasons?

Anyway, even if it's not the cause of the speed difference, it's great that I'll now be able to have full control over the dataset so I can ensure I send enough data for long context models, but not more than I need. And also now I can bulk tokenise the dataset myself, which I can do very fast.
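
A rough sketch of that bulk-tokenize-it-yourself flow, slicing a corpus into 128 samples of 4096 tokens each (the dataset, the model id, and the per-sample dict layout are illustrative assumptions, not something confirmed in this thread):

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")  # placeholder
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")

seq_len, n_samples = 4096, 128

# Tokenize the whole corpus once, then cut it into fixed-length samples.
ids = tokenizer("\n\n".join(raw["text"]), return_tensors="pt").input_ids[0]

calibration = []
for i in range(n_samples):
    chunk = ids[i * seq_len : (i + 1) * seq_len]
    calibration.append(
        {
            "input_ids": chunk.unsqueeze(0),
            "attention_mask": torch.ones(1, seq_len, dtype=torch.long),
        }
    )
# `calibration` can then be passed as the dataset argument of GPTQQuantizer.
```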

fxmarty merged commit afe2e3c into huggingface:main on Dec 13, 2023
40 of 46 checks passed