
Fixed Llama-3_1-Nemotron-51B doesn't work when 4K or more tokens #11008

Merged
8 commits merged on Dec 31, 2024

Conversation

@ymcki (Contributor) commented Dec 29, 2024

Make sure to read the contributing guidelines before submitting a PR

This is to fix this bug:
#11002

After inspecting the parameters of the Llama-3.1-70B and 51B GGUFs while loading them with llama-cli, I noticed exactly one difference: rope_theta (500000.0 vs 10000.0). According to the config.json of the 51B model, this value should be 500000.0, which means the current convert_hf_to_gguf.py doesn't read rope_theta for DeciLMCausalModel. I fixed that and made this PR.
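For reference, a minimal sketch of what such a fix could look like in the DeciLM converter class of convert_hf_to_gguf.py (simplified; the method shown and the surrounding code are assumptions, not the exact merged diff):

```python
# Sketch only: read rope_theta from config.json (self.hparams) and write it
# to the GGUF metadata, so the converted model uses 500000.0 instead of the
# default 10000.0.
def set_gguf_parameters(self):
    super().set_gguf_parameters()
    rope_theta = self.hparams.get("rope_theta")
    if rope_theta is not None:
        self.gguf_writer.add_rope_freq_base(rope_theta)
```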

I generated a GGUF with the correct rope_theta of 500000.0. It works with llama.cpp b4380 or above without recompilation, since I only fixed convert_hf_to_gguf.py and didn't touch the C code.
https://huggingface.co/ymcki/Llama-3_1-Nemotron-51B-Instruct-GGUF/blob/main/Llama-3_1-Nemotron-51B-Instruct.imatrix.Q4_K_M.gguf
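To double-check a converted file without loading it in llama-cli, the rope frequency base can also be read straight from the GGUF metadata with the gguf Python package. A rough sketch (the metadata key prefix depends on the model's architecture string, so the exact key name is an assumption):

```python
# Rough sketch: print the rope.freq_base metadata field of a GGUF file.
# Requires the gguf package from gguf-py (pip install gguf).
from gguf import GGUFReader

reader = GGUFReader("Llama-3_1-Nemotron-51B-Instruct.imatrix.Q4_K_M.gguf")
for name, field in reader.fields.items():
    if name.endswith("rope.freq_base"):
        # scalar metadata values live in the last part of the field
        print(name, field.parts[-1][0])  # should print 500000.0 after this fix
```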

As a side note, inspecting the tokenizer_config.json of Llama-3.1-70B, I found that it also has both eos_token and eot_token set to '<|eot_id|>'. Therefore, it is probably not a typo for 51B, so I also removed the four lines in set_vocab related to this.
This gets rid of the following warning without causing any problems:
llm_load_vocab: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect

@github-actions bot added the python (python script changes) label on Dec 29, 2024
@ggerganov merged commit bc7b1f8 into ggerganov:master on Dec 31, 2024
5 checks passed
Labels: python (python script changes)
3 participants