
Fix AutoModel can't load gptq model due to module prefix mismatch vs AutoModelForCausalLM #2146

Merged
merged 3 commits into huggingface:main from fix-gptq-constant on Jan 6, 2025

Conversation

@LRL-ModelCloud (Contributor) commented Jan 2, 2025

What does this PR do?

This PR fixes the issue encountered when using AutoModel to load a GPTQ model, which produced the following error:

Traceback (most recent call last):
  File "/root/GPTQModel/test_inf.py", line 5, in <module>
    model = AutoModel.from_pretrained(model_id, revision="main")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/transformers/src/transformers/models/auto/auto_factory.py", line 565, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/transformers/src/transformers/modeling_utils.py", line 4090, in from_pretrained
    hf_quantizer.preprocess_model(
  File "/root/transformers/src/transformers/quantizers/base.py", line 194, in preprocess_model
    return self._process_model_before_weight_loading(model, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/transformers/src/transformers/quantizers/quantizer_gptq.py", line 84, in _process_model_before_weight_loading
    model = self.optimum_quantizer.convert_model(model)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/optimum/optimum/gptq/quantizer.py", line 292, in convert_model
    self.block_name_to_quantize = get_block_name_with_pattern(model)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/optimum/optimum/gptq/utils.py", line 79, in get_block_name_with_pattern
    raise ValueError("Block pattern could not be match. Pass `block_name_to_quantize` argument in `quantize_model`")
ValueError: Block pattern could not be match. Pass `block_name_to_quantize` argument in `quantize_model`
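
A minimal reproduction, along the lines of the test_inf.py shown in the traceback (the model id below is one of our GPTQ checkpoints; any GPTQ-quantized checkpoint hits the same code path):

```python
# Minimal reproduction sketch: loading a GPTQ checkpoint through AutoModel.
from transformers import AutoModel

model_id = "ModelCloud/Llama-3.2-1B-Instruct-gptqmodel-4bit-vortex-v1"

# Before this fix, this raises "Block pattern could not be match" because the
# GPTQ quantizer cannot find the transformer block prefix on the bare model.
model = AutoModel.from_pretrained(model_id, revision="main")
```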

The reason for this error is that models loaded by AutoModel have different module prefixes than models loaded by AutoModelForCausalLM. For example, for a Llama model, the modules after loading with AutoModel are named 'layers.0.self_attn.q_proj', 'layers.0.self_attn.k_proj', 'layers.0.self_attn.v_proj', etc., while after loading with AutoModelForCausalLM they are named 'model.layers.0.self_attn.q_proj', 'model.layers.0.self_attn.k_proj', 'model.layers.0.self_attn.v_proj', etc. The prefixes differ, but they refer to the same modules.
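
A rough sketch of the mismatch and of a prefix-tolerant way to resolve the block name; this only illustrates the idea, not necessarily the exact change made in this PR (the find_block_name helper is hypothetical):

```python
# The same Llama layer is reachable under two different qualified names,
# depending on which Auto class built the model:
automodel_name = "layers.0.self_attn.q_proj"         # AutoModel -> bare LlamaModel
causal_lm_name = "model.layers.0.self_attn.q_proj"   # AutoModelForCausalLM wrapper

# Hypothetical prefix-tolerant lookup: try each known block pattern, with and
# without the leading "model." prefix, until one matches a module name.
def find_block_name(module_names, patterns=("model.layers", "layers")):
    for pattern in patterns:
        if any(name == pattern or name.startswith(pattern + ".") for name in module_names):
            return pattern
    return None

print(find_block_name([automodel_name]))  # -> "layers"
print(find_block_name([causal_lm_name]))  # -> "model.layers"
```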

Who can review?

@Qubitium @SunMarc

@SunMarc (Member) left a comment:

SGTM!

@LRL-ModelCloud LRL-ModelCloud marked this pull request as ready for review January 3, 2025 01:17
@LRL-ModelCloud LRL-ModelCloud changed the title Fix the issue of AutoModel failing to load the gptq model. Fix AutoModel can't load gptq model due to module prefix mismatch vs AutoModelForCausalLM Jan 3, 2025
@Qubitium (Contributor) commented Jan 3, 2025

@SunMarc We found this bug while submitting a test 1B quantized model to the HF OpenLLM leaderboard.

  1. Can you notify the maintainers of the OpenLLM leaderboard that there are likely bugs in the test runners? The 1B gptq model has been in the queue for over 23 hours. It should have failed.

https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/add

model: https://huggingface.co/ModelCloud/Llama-3.2-1B-Instruct-gptqmodel-4bit-vortex-v1

[Screenshot: Open LLM Leaderboard submission page on huggingface.co, 3 Jan 2025]

  2. Are the tests performed on GPU or CPU?
  3. Will the runner auto-install gptqmodel or autogptq (since the gptqmodel transformers integration has not yet been merged)?

We are trying to have it test our vortex high-recovery gptq models, but I don't believe the existing runner will work with gptq models even if this PR is merged, since it is most likely missing the autogptq (and future gptqmodel) packages.

@SunMarc (Member) commented Jan 3, 2025

The tests are performed on an H100 GPU and normally, if nothing has changed, it should install autogptq. On the previous leaderboard, lots of gptq models were evaluated.

cc @alozowski, do you know what is happening with this model?

@alozowski commented:
An HF Open LLM Leaderboard maintainer here! Sorry for my late reply. Indeed, our evaluation queue got stuck, but we fixed it this morning.

I can confirm that we have a request file for ModelCloud/Llama-3.2-1B-Instruct-gptqmodel-4bit-vortex-v1, but unfortunately it failed under the automatic evaluation. Let me try running a manual evaluation to see how it goes.

Also, feel free to open a discussion about this model in our Community section so we can discuss the model evaluation there

@IlyasMoutawwakil (Member) commented:

LGTM, thanks for the fix! Will wait for the GPTQ tests to pass.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@IlyasMoutawwakil (Member) commented:

The CI failures are unrelated to this PR.

@IlyasMoutawwakil merged commit 40a518b into huggingface:main Jan 6, 2025
39 of 48 checks passed
@Qubitium deleted the fix-gptq-constant branch January 6, 2025 13:58
@Qubitium (Contributor) commented Jan 7, 2025

> I can confirm that we have a request file for ModelCloud/Llama-3.2-1B-Instruct-gptqmodel-4bit-vortex-v1, but it failed under the automatic evaluation unfortunately. Let me try running a manual evaluation to see how it goes

@alozowski Thanks for the update. Can you confirm the failure was caused by the bug that this PR fixed? If the error is unrelated to this bug fix, I will move the discussion to the leaderboard community board.

@Qubitium (Contributor) commented:

@alozowski We need an update on this. Please respond:

  • Is the ModelCloud/Llama-3.2-1B-Instruct-gptqmodel-4bit-vortex-v1 model failing because of this bug? We are not privy to the actual pipeline that the openllm leaderboard2 runs and are only following the official doc, which explicitly says to use the AutoModel() API to load.

There are 0 gptq-based models on the leaderboard that I can see. There are 2 gguf and 2 awq models based on a rough search, so something is likely wrong with the gptq pipeline code.

We are willing to fix this and everything related to gptq testing for the leaderboard, if it is related to HF code and we can get some debug feedback.

@alozowski commented:
Hi @Qubitium!
No, the error on the Leaderboard was unrelated to this bug. Unfortunately, it was caused by incorrectly installed dependencies. I manually evaluated ModelCloud/Llama-3.2-1B-Instruct-gptqmodel-4bit-vortex-v1, and the evaluation was successful, so the results are now available on the Leaderboard. Additionally, I'm going to incorporate my changes into the Leaderboard auto-evaluation system, so all users will be able to submit their gptq models seamlessly.
