Using convert.py with a fine tuned phi-2 #5009

Closed
FiveTechSoft opened this issue Jan 17, 2024 · 10 comments

@FiveTechSoft

FiveTechSoft commented Jan 17, 2024

We are loading phi-2 from HF using load_in_8bit=True and torch_dtype=torch.float16, then we fine-tune it and finally we save it locally.
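
Roughly, the load/save flow looks like this (a minimal sketch; the fine-tuning step is omitted, and arguments such as device_map and trust_remote_code are assumptions on our side):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Loading in 8-bit (bitsandbytes) stores the quantized weights as int8,
# which is what later shows up as dtype 'I8' in the saved safetensors files.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", trust_remote_code=True)

# ... fine-tuning happens here ...

model.save_pretrained("./phi-2")
tokenizer.save_pretrained("./phi-2")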

When running convert.py ./phi-2 we get this error:
File "/content/convert.py", line 764, in convert
data_type = SAFETENSORS_DATA_TYPES[info['dtype']]
KeyError: 'I8'

If we try the same using load_in_8bit=False then we get:
File "/content/convert.py", line 257, in loadHFTransformerJson
f_norm_eps = config["rms_norm_eps"],
KeyError: 'rms_norm_eps'

How do we generate a GGUF from a fine-tuned phi-2? Many thanks.

@FiveTechSoft changed the title from "Using convert.py with phi-2" to "Using convert.py with a fine tuned phi-2" on Jan 18, 2024
@Rishbah-76

Having the same kind of issue with a fine-tuned Falcon 7B bf16 model:

ggllm.cpp/convert.py", line 761, in convert
data_type = SAFETENSORS_DATA_TYPES[info['dtype']]
KeyError: 'BF16'

@MotorCityCobra

Do you have reason to believe it is supported?

@apepkuss

Same issue with phi-2.

@compilade
Collaborator

compilade commented Jan 22, 2024

How do we generate a GGUF from a fine-tuned phi-2?

I suggest using convert-hf-to-gguf.py (which is where conversion from phi-2 models was implemented) instead of convert.py.

When new models come out on HuggingFace, their conversion is usually added in convert-hf-to-gguf.py (the "hf" in there stands for HuggingFace, I think), probably because it allows sharing a lot of metadata-loading code (especially regarding tokenizers, but also tensor names). It's also more obvious (to me, at least) where to add support for a new model in convert-hf-to-gguf.py than in convert.py. So that's possibly why convert.py doesn't also support converting phi-2.
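
For a local fine-tuned checkpoint, the invocation is something like the following (the --outfile and --outtype flags come from the script's --help; double-check them on your checkout):

python convert-hf-to-gguf.py ./phi-2 --outfile phi-2-f16.gguf --outtype f16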

@apepkuss

How do we generate a GGUF from a fine-tuned phi-2?

I suggest using convert-hf-to-gguf.py (which is where conversion from phi-2 models was implemented) instead of convert.py.


Works for me. Thanks a lot!

@tgalery

tgalery commented Mar 11, 2024

I have a similar issue: after fine-tuning, convert-hf-to-gguf.py gives KeyError: 'n_layer' (line 1064), while convert.py gives KeyError: 'rms_norm_eps' (line 257).

After some more debugging in convert-hf-to-gguf.py, it seems the config I get for the Phi-2 model is very different from the expected one. The config from my fine-tuned model is:

{
	'_name_or_path': 'microsoft/phi-2',
	'architectures': ['PhiForCausalLM'],
	'attention_dropout': 0.0,
	'auto_map': {
		'AutoConfig': 'microsoft/phi-2--configuration_phi.PhiConfig',
		'AutoModelForCausalLM': 'microsoft/phi-2--modeling_phi.PhiForCausalLM'
	},
	'bos_token_id': 50256,
	'embd_pdrop': 0.0,
	'eos_token_id': 50256,
	'hidden_act': 'gelu_new',
	'hidden_size': 2560,
	'initializer_range': 0.02,
	'intermediate_size': 10240,
	'layer_norm_eps': 1e-05,
	'max_position_embeddings': 2048,
	'model_type': 'phi',
	'num_attention_heads': 32,
	'num_hidden_layers': 32,
	'num_key_value_heads': 32,
	'partial_rotary_factor': 0.4,
	'qk_layernorm': False,
	'resid_pdrop': 0.1,
	'rope_scaling': None,
	'rope_theta': 10000.0,
	'tie_word_embeddings': False,
	'torch_dtype': 'bfloat16',
	'transformers_version': '4.37.2',
	'use_cache': True,
	'vocab_size': 51200
}

Whereas Phi2Model.set_gguf_parameters expects keys like n_layer, n_head, n_positions, n_embd, layer_norm_epsilon, and rotary_dim.

If it's just a matter of mapping old keys to new ones, I'm happy to work on an MR, but there seems to be some info lost there.
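
For illustration, my rough guess at the correspondence (my own reading of the two configs, not taken from the converter itself) would be:

# Rough old-key -> new-key correspondence for Phi-2 (my own guess,
# not the converter's actual mapping table):
OLD_TO_NEW = {
    "n_layer": "num_hidden_layers",
    "n_head": "num_attention_heads",
    "n_embd": "hidden_size",
    "n_positions": "max_position_embeddings",
    "layer_norm_epsilon": "layer_norm_eps",
}
# "rotary_dim" has no direct counterpart; it would have to be derived as
# partial_rotary_factor * (hidden_size / num_attention_heads) = 0.4 * 80 = 32.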

@compilade
Collaborator

compilade commented Mar 12, 2024

I have a similar issue, after fine-tuning, I get KeyError: 'n_layer' (line 1064 of convert-hf-to-gguf.py), otherwise, I get KeyError: 'rms_norm_eps' (line 257 of convert.py).

@tgalery The line numbers in the errors you got seem different from those in the latest commit.

Whereas Phi2Model.set_gguf_parameters, expects keys like n_layer, n_head, n_positions, n_embed, layer_norm_epsilon, and rotary_dim.

If it's mapping old keys to new ones, I'm happy to work on a MR, but there seems to be some info lost there.

The fix you're describing is already implemented, but maybe your local checkout of llama.cpp hasn't been updated in a while. Converting Phi-2 with convert-hf-to-gguf.py with such a config.json should work as of at least #4903 (which was merged on January 13), since it checks for num_hidden_layers (which seems to exist in your config.json) before falling back to n_layer.

This is what you should see in convert-hf-to-gguf.py if your source tree of llama.cpp is recent enough:

@Model.register("PhiForCausalLM")
class Phi2Model(Model):
    model_arch = gguf.MODEL_ARCH.PHI2

    def set_gguf_parameters(self):
        block_count = self.find_hparam(["num_hidden_layers", "n_layer"])
        rot_pct = self.find_hparam(["partial_rotary_factor"])
        n_embd = self.find_hparam(["hidden_size", "n_embd"])
        n_head = self.find_hparam(["num_attention_heads", "n_head"])
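
For context, find_hparam just returns the value of the first listed key present in config.json, roughly like this sketch (not the exact llama.cpp implementation, which is a method on Model):

def find_hparam(hparams: dict, keys: list) -> object:
    # Return the value of the first key that exists in config.json, so
    # ["num_hidden_layers", "n_layer"] prefers the new-style key and only
    # falls back to "n_layer" if it is missing.
    for key in keys:
        if key in hparams:
            return hparams[key]
    raise KeyError(f"could not find any of: {keys}")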

If it still doesn't work with the latest version, please do tell. Hope this helps :)

@tgalery

tgalery commented Mar 12, 2024

Oh I see, llama.cpp was pulled into a Python repo via git submodules; I'll update that and try again. Many thanks.

@tgalery

tgalery commented Mar 13, 2024

@compilade thanks for the explanation. It works like a charm. Just a question: I'm working on a pipeline, and for some model types, say Mistral-7B, one needs to use the convert.py script, while for others, say phi-2, we need to use convert-hf-to-gguf.py. Is there a plan to unify these?
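
For reference, the pipeline currently branches on the architecture in config.json, roughly like this (a hypothetical helper on our side, not anything llama.cpp provides; flags assumed):

import json
import subprocess
from pathlib import Path

def convert_to_gguf(model_dir: str, outfile: str) -> None:
    # Decide which llama.cpp converter script to call based on the model architecture.
    config = json.loads((Path(model_dir) / "config.json").read_text())
    arch = config["architectures"][0]
    script = "convert-hf-to-gguf.py" if arch == "PhiForCausalLM" else "convert.py"
    subprocess.run(["python", script, model_dir, "--outfile", outfile], check=True)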

@github-actions bot added the stale label on Apr 13, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
