Where can I find a compatible draft model for Llama-3_1-Nemotron-51B-Instruct? #10998
I'm struggling to find a compatible draft model for this quantized model family.
Original model: https://huggingface.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
Draft models I've tried:
Mismatch error from llama.cpp:
A pointer to a compatible smaller model, or some instructions on how to distill the Nemotron one to create my own compatible model, would be much appreciated.
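For readers landing here with a similar mismatch: a draft model generally needs to share the target model's tokenizer vocabulary. Below is a minimal sketch of how one might compare two vocabularies before trying a pairing. It assumes both vocabularies have already been extracted as plain lists of token strings; the function name `vocab_mismatches` and the toy data are hypothetical, and this is not llama.cpp's actual compatibility check.

```python
from typing import List, Sequence


def vocab_mismatches(target: Sequence[str], draft: Sequence[str], limit: int = 5) -> List[str]:
    """Describe differences between two vocabularies (hypothetical helper).

    Reports a size difference, then up to `limit` total problems,
    comparing tokens position by position over the shared id range.
    """
    problems: List[str] = []
    if len(target) != len(draft):
        problems.append(f"size differs: target={len(target)} draft={len(draft)}")
    for token_id, (t, d) in enumerate(zip(target, draft)):
        if t != d:
            problems.append(f"token {token_id} differs: {t!r} vs {d!r}")
            if len(problems) >= limit:
                break
    return problems


# Toy vocabularies, purely illustrative (not real model data).
target_vocab = ["<s>", "</s>", "hello", "world"]
draft_vocab = ["<s>", "</s>", "hello", "there", "extra"]
report = vocab_mismatches(target_vocab, draft_vocab)
for line in report:
    print(line)
```

An empty list means the two vocabularies match token for token under this simple comparison. Real checks may tolerate small differences or compare special tokens separately.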
Answered by mashdragon on Dec 30, 2024
Replies: 1 comment
This was a GGUF config issue that appears to be fixed with #11008
Answer selected by
mashdragon