Switch model to Llama 3 #17

Open
mirix opened this issue Oct 25, 2024 · 1 comment
mirix commented Oct 25, 2024

Thanks for this great project. It works quite well.

However, I would like to change the model to:

https://huggingface.co/mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated-GGUF

I have been trying to adapt the prompt template. It kind of works even without changing the template, but eventually it falls into an endless response loop. I suspect the issue is the stop tokens, or something along those lines.

Has anyone tried this?

KoljaB (Owner) commented Oct 25, 2024

You are most likely right: the cause is probably the different prompt formats ("chat templates") the models use. This project is a bit outdated and therefore still uses the raw template format.
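For reference, the Llama 3 chat template and its stop tokens can be sketched in plain Python. This is only a sketch, assuming you feed the GGUF model a raw prompt string (e.g. through llama.cpp); the special-token names come from Meta's published Llama 3 format. The endless-loop symptom is typical when `<|eot_id|>` is not registered as a stop token.

```python
# Stop tokens the backend must be told about, otherwise generation
# runs past the end of the assistant turn and loops.
LLAMA3_STOP_TOKENS = ["<|eot_id|>", "<|end_of_text|>"]

def build_llama3_prompt(messages):
    """Render a list of {'role', 'content'} dicts in the raw Llama 3 chat format."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Leave an open assistant header so the model continues as the assistant.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful voice assistant."},
    {"role": "user", "content": "Hello!"},
])
```

Passing `LLAMA3_STOP_TOKENS` to the generation call (the exact parameter name depends on your backend) is what prevents the runaway responses.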

Meanwhile most of this can be abstracted away: today most LLM providers offer servers with chat endpoints.

Or you can directly use Hugging Face's transformers.pipeline, which abstracts the model's chat template away.

If you want to use the chat endpoints, the LocalEmotionalAIVoiceChat project has example code that does it this way.

Alternatively, you can use the Python OpenAI library to wrap the base endpoints (http://localhost:1234/v1). For example code, the Linguflex project makes some of its LLM requests this way.
