
[Proposal] Expand quantization model support #684

Open

miguel-kjh opened this issue Jul 26, 2024 · 1 comment
Labels: complexity-high (Very complicated changes for people to address who are quite familiar with the code)

Comments

@miguel-kjh

Why does Transformer Lens only support quantized LLaMA models?

Hi everyone,

I'm trying to use the transformer_lens library to study the activations of a quantized Mistral 7B model (unsloth/mistral-7b-instruct-v0.2-bnb-4bit). However, when I try to load it, I encounter a problem.

This is the code I'm using:

import transformer_lens

# `model` is assumed to be a PEFT / unsloth model loaded earlier in the
# notebook; merging folds the LoRA adapters back into the base weights.
model_merged = model.merge_and_unload()

model_hooked = transformer_lens.HookedTransformer.from_pretrained(
    "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
    hf_model=model_merged,
    hf_model_4bit=True,
    fold_ln=False,
    fold_value_biases=False,
    center_writing_weights=False,
    center_unembed=False,
    tokenizer=tokenizer,  # tokenizer loaded alongside the Hugging Face model
)

The problem is that I get an assertion error stating that only LLaMA models can be used in quantized format with this library. This is the error message I receive:

---------------------------------------------------------------------------
AssertionError  Traceback (most recent call last)
AssertionError: Quantization is only supported for Llama models

I find it illogical and frustrating that only LLaMA models are compatible with transformer_lens in quantized format. Can anyone explain why this decision was made? Is there a technical reason behind this or any way to work around this issue so that I can use my Mistral 7B model?
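For context, here is a minimal sketch of the kind of guard that presumably raises this error and of what expanding support would involve. The names below (SUPPORTED_4BIT_ARCHITECTURES, check_4bit_support) are illustrative assumptions, not actual TransformerLens internals:

# Illustrative sketch only -- these names are hypothetical, not TransformerLens APIs.
SUPPORTED_4BIT_ARCHITECTURES = {"LlamaForCausalLM"}  # assumed allow-list

def check_4bit_support(architecture: str, load_in_4bit: bool) -> None:
    """Raise if 4-bit loading is requested for an unsupported architecture."""
    if load_in_4bit and architecture not in SUPPORTED_4BIT_ARCHITECTURES:
        raise AssertionError("Quantization is only supported for Llama models")

# Expanding support would roughly mean adding e.g. "MistralForCausalLM" to the
# allow-list and teaching the per-architecture weight-conversion code to handle
# bitsandbytes 4-bit parameters instead of regular float tensors.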

I appreciate any guidance or solutions you can provide.

Thanks!

bryce13950 changed the title from "[Question] Why does Transformer Lens only support quantized LLaMA models?" to "[Proposal] Expand quantization model support" on Nov 3, 2024
bryce13950 added the complexity-high label on Nov 3, 2024
@bryce13950 (Collaborator)

It simply wasn't done because quantization support was contributed by a volunteer, who only covered Llama models. We can definitely put expanding this on the list of todos.
