
use existing llama.cpp install #9

Open · scalar27 opened this issue Mar 24, 2024 · 7 comments

@scalar27

I've been using llama.cpp for quite a while (M1 Mac). Is there a way I can get ai_voicetalk_local.py to point to that installation instead of reinstalling it here? Sorry, newbie question...

@KoljaB (Owner) commented Mar 25, 2024

Just leave out step 2 of the installation. I think the Coqui engine does not run in real time on a Mac, though.
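
A quick way to check whether an existing setup already provides the bindings (a sketch, assuming the llama-cpp-python package, whose import name is llama_cpp):

```python
# Succeeds only if the Python bindings for llama.cpp are importable
# from the current environment.
import llama_cpp
print(llama_cpp.__version__)
```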

@scalar27 (Author)

I did leave out step 2 but then I get an error when I try to run:

```
ModuleNotFoundError: No module named 'llama_cpp'
```

@KoljaB (Owner) commented Mar 27, 2024

The Python import of llama_cpp fails, which means your environment does not have working Python bindings for your llama.cpp installation.
Please look here for the Mac bindings, probably Metal (MPS).
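
For Apple Silicon, a minimal install sketch for the Metal-accelerated bindings (assuming the llama-cpp-python package; the exact CMake flag has changed between versions, so check the project's README for the one matching your release):

```bash
# Build and install llama-cpp-python with Metal (GPU) acceleration on
# Apple Silicon. Older releases used -DLLAMA_METAL=on; newer ones use
# -DGGML_METAL=on.
CMAKE_ARGS="-DLLAMA_METAL=on" pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python
```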

@scalar27 (Author)

Thank you. I did get it to work following your comment. Like the other M1 person, I do get stuttering. It's a shame because the voice quality is excellent and the latency is rather short. Hope a future update might solve this for us!

@scalar27 (Author)

I managed to get this working with the Gemma 2 model. However, I am having trouble setting the parameters; it's working, but it doesn't seem optimal. I see them in creation_params.json and also in coqui_engine.py. Would it be possible for LocalAiVoiceChat to use llama.cpp's server endpoint instead? Or would that require a lot of rewriting?
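
For illustration, a minimal sketch of what talking to llama.cpp's built-in server could look like (assuming a server running locally on its default port 8080 and exposing the OpenAI-compatible chat endpoint; the launch command below is a hypothetical example and varies by llama.cpp version):

```python
import requests

# Sketch: query a locally running llama.cpp server via its
# OpenAI-compatible chat completions endpoint. Assumes the server was
# started with something like `./llama-server -m model.gguf --port 8080`
# (binary name and flags vary by version).
response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Hello!"}],
        "temperature": 0.7,  # sampling parameters travel with each request
        "max_tokens": 128,   # instead of living in creation_params.json
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

One appeal of this approach is that generation parameters become per-request fields in the HTTP call rather than load-time settings.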

@KoljaB (Owner) commented Jul 17, 2024

I like that idea, I'll have to look into that.

@scalar27 (Author)

Great. It seems like a more standard approach these days. I'd be happy to test whatever. As mentioned above, I'm on an M1 Mac, so this isn't the fastest setup, but it's now working pretty well with no stuttering.
