[TEMP FIX] Ollama / llama.cpp: cannot find tokenizer merges in model file [duplicate] #1062
Comments
Oh yep, the issue is in transformers 4.45 - I'm communicating with them to fix the problem.
My version in the notebook is transformers 4.44.2 (the same as last week, when it was working), but I had the same problem.
If you're installing Unsloth right from git, here is how I worked around this bug. First I manually installed Transformers, pinning it to version 4.44.2.
Then I installed Unsloth using the commit hash for the September 2024 release. The September 2024 release of Unsloth only requires Transformers 4.44, so it does not attempt to upgrade the Transformers installation from the first step.
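A minimal sketch of those two steps as a notebook cell; the commit hash below is a placeholder I've made up, not the actual September 2024 release hash, and the exact pins may differ from what was originally posted:

```python
# Workaround sketch: pin Transformers first, then install Unsloth from a
# specific commit so it does not upgrade Transformers to the broken 4.45.
!pip install "transformers==4.44.2"
!pip install "unsloth @ git+https://github.com/unslothai/unsloth.git@<SEPTEMBER_2024_COMMIT>"  # placeholder hash
```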
But in this case, do you need to perform a new training, or do you only have to regenerate the GGUF?
It looks like it's a problem with the tokenizer that's being exported by unsloth :(
Is there any workaround for this? I am blocked. It was working fine just 3 days ago.
I also tried your suggestion, but it still fails.
Does the merged F16 version work, either raw or GGUF?
My workflow involves saving the model straight to GGUF from the notebook. This has been working fine for me with my workaround of locking the Unsloth version down to the September release. Do be sure you're also manually installing Transformers and specifying version 4.44.2; otherwise, if you just install Unsloth on its own, it will pull in the newer, broken Transformers. EDIT: If it helps, the first cell of my notebook does exactly that: it pins Transformers to 4.44.2 and installs Unsloth from the September commit hash, as in the sketch above.
It's the combination of manually installing Transformers and installing Unsloth from a specific hash that is the key to working around this for the time being.
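For context, a rough sketch of the GGUF export step this workflow refers to, assuming Unsloth's documented `FastLanguageModel` / `save_pretrained_gguf` helpers; the model name and quantization method are illustrative, not taken from the original notebook:

```python
from unsloth import FastLanguageModel

# Load the (fine-tuned) model; the model id here is just an example.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Export a quantized GGUF -- the step that produced the unloadable tokenizer
# merges before the re-upload / llama.cpp fix landed.
model.save_pretrained_gguf("model_gguf", tokenizer, quantization_method="q8_0")
```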
Thanks, this worked. I hope they fix it properly.
Extreme apologies on the delay - was out for a few days - will get to the bottom of this and fix this asap - apologies again!
@danielhanchen thank you. Eagerly waiting for your fix to resume my work. <3
So it seems #1065 and this are identical as well - I will update both threads.
I just communicated with the Hugging Face team - they will upstream the necessary updates. I re-uploaded all the affected models, so please try again and see if it works! This unfortunately means you need to re-finetune the model if you did not save the 16-bit merged weights or LoRAs - extreme apologies. If you did save them, update Unsloth, then reload them and save them to GGUF. Update Unsloth via:
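A sketch of the usual force-update command as a notebook cell, assuming the standard git install; the exact flags from the original comment were not preserved:

```python
# Reinstall Unsloth from the latest main branch, bypassing the pip cache.
!pip uninstall unsloth -y
!pip install --upgrade --no-cache-dir "unsloth @ git+https://github.com/unslothai/unsloth.git"
```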
I will update everyone once the Hugging Face team resolves the issue! Sorry again! Pinging: @jwhitehorn @xmaayy @avvRobertoAlma @nullnuller @DiLiuNEUexpresscompany @laoc81 @Mukunda-Gogoi
Thanks @danielhanchen! Let me try it out!
@drsanta-1337 If you already updated - you might have to do it again, sorry!! I just pushed changes into main.
No probs!
Thanks @danielhanchen, and sorry for the disturbances; to give some context as to what is happening here: we updated the format of merges serialization in `tokenizers`. The change was done to be backwards-compatible: files saved in the old format still load with the new version. However, it could not be forwards-compatible: if a file is serialized with the new format, older versions of the library cannot read it. This is why we're seeing this issue: new files are serialized using the new version, and these files are not loadable in llama.cpp yet. We're updating all other codepaths (namely llama.cpp) to adapt to the new version. Once that is shipped, all your trained checkpoints will be directly loadable as usual. We're working with llama.cpp to ship this as fast as possible. Thank you! Issue tracker in llama.cpp: ggerganov/llama.cpp#9692
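For illustration, my understanding of the serialization change (an assumption, not quoted from the thread): in tokenizer.json, BPE merges used to be stored as single space-separated strings and are now stored as pairs, so older parsers that expect the string form report the missing-merges error:

```python
# Illustration only -- assumed shapes, not copied from the actual tokenizers code.
# Old serialization: each merge is one space-separated string.
old_style_merges = ["Ġ t", "h e", "i n"]
# New serialization: each merge is a pair of tokens.
new_style_merges = [["Ġ", "t"], ["h", "e"], ["i", "n"]]

# A consumer written against the old format (e.g. llama.cpp's converter at the
# time) looks for the string form, does not find it, and fails with
# "cannot find tokenizer merges in model file".
```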
@danielhanchen The fix is working now, thanks! I fine-tuned llama-3.2-1B-Instruct, generated a Q5_K_M GGUF, and ran it using ollama. Thanks a lot, I'm unblocked now!
Hi, I tried fine-tuning both llama 3.1-8b-instruct and llama 3-8b-instruct following the notebook you provided here.
The training phase completed without errors and I generated the GGUF quantized at 8-bit.
However, I cannot load the GGUF in LLM Studio because of this error:
"llama.cpp error: 'error loading model vocabulary: cannot find tokenizer merges in model file\n'"
Did you have this kind of problem?
I successfully fine-tuned both mistral-instruct and mistral-small-instruct without problems; I'm only experiencing issues with llama.