Problem with training LoRA for Model "TheBloke/Pygmalion-2-13B-GPTQ" #5200
-
You can perform LoRA training on 4-bit GPTQ models, but you have to load them with the Transformers model loader, not any of the other ones. If you load the model with (e.g.) ExLlamav2_HF, you'll get the error message you've shown here. The docs say you should tick the 'auto-devices' and 'disable_exllama' options when loading the model with the Transformers loader in order to perform LoRA training.
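For reference, here's a rough sketch of what that setup corresponds to if you use transformers + peft directly instead of the web UI. The model name is the one from this thread; the LoRA hyperparameters and save path are just placeholders, and on older transformers versions the option is `disable_exllama=True` rather than `use_exllama=False`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "TheBloke/Pygmalion-2-13B-GPTQ"

# 'disable_exllama' in the UI maps to use_exllama=False here: the ExLlama
# kernels are inference-only and don't support backprop through the
# quantized layers, which is why training fails with other loaders.
quant_config = GPTQConfig(bits=4, use_exllama=False)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",                 # the 'auto-devices' option
    quantization_config=quant_config,
)

# Freeze the quantized base weights, then attach trainable LoRA adapters.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16,                                  # placeholder rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # common choice for Llama-family models
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# ...train with transformers.Trainer or your own loop, then:
# model.save_pretrained("loras/my-pygmalion-lora")
```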
-
Thank you so much :)
-
Another quick question: if I train the LoRA with Transformers and want to apply it with ExLlamav2_HF, should that work, or can I only use it with the Transformers model loader?
-
You can definitely train with Transformers and apply the resulting LoRA to the model reloaded with ExLlamav2_HF (or ExLlamav2); that's what I do, because inference is faster that way. I can only get the 4-bit GPTQ quants to work reliably, though; the 8-bit ones don't seem to work that way.
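If you want to sanity-check the trained adapter outside the web UI before reloading with ExLlamav2_HF, a minimal sketch is to reattach it to the quantized base with PEFT. The adapter path is hypothetical; use whatever directory your training run saved to.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

model_id = "TheBloke/Pygmalion-2-13B-GPTQ"

# The GPTQ quantization config is read from the model repo, so no extra
# arguments are needed for inference.
base = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Attach the LoRA weights trained earlier with the Transformers loader.
model = PeftModel.from_pretrained(base, "loras/my-pygmalion-lora")

prompt = "Hello, how are you?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=50)[0]))
```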