Merge changes from randaller/llama-chat #4

Open · wants to merge 1 commit into main
Conversation

Honigmelone

Hey,

I noticed the default prompt in example-chat.py was quite different between your two repos. I have merged some more recent changes from https://github.com/randaller/llama-chat to get the interactive chat working in the CPU-only version.

I have not merged the model and the tokenizer yet. You might want to consider building on this and merging those as well, so the two repositories stay consistent.

@randaller (Owner)

@Honigmelone this will break all the other examples; llama-chat is now the primary repo, and this repo is deprecated

@Honigmelone (Author)

I see. Is it somehow possible to run llama-chat in CPU-only mode, or are you dropping that functionality?

@alaestor commented Mar 19, 2023

I haven't a clue what I'm doing and am just quickly messing around, but regarding llama-chat/llama/model.py: I changed use_gpu in def forward to False, and then changed all occurrences of .cuda() to .cpu() in Transformer's and Attention's inits. It just sorta... worked. Kind of. I assume it's tailored for GPU use, because it's slow as heck on CPU (going from llama-cpu at ~1 it/s to the bodged llama-chat at 6–8 s/it with 7B on my 7950X).
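
For reference, here's a minimal sketch of the kind of edit I mean; the class and argument names below are simplified stand-ins, not the actual llama/model.py code:

```python
# Sketch of moving a pre-allocated KV cache off the GPU.
# Simplified stand-in for the Attention init in llama/model.py.
import torch
import torch.nn as nn

class Attention(nn.Module):
    def __init__(self, dim: int, n_heads: int, max_batch: int, max_seq: int):
        super().__init__()
        self.head_dim = dim // n_heads
        self.wq = nn.Linear(dim, dim, bias=False)
        # Before (GPU-only): the caches were allocated with .cuda()
        # After (CPU bodge): allocate them on the CPU instead
        self.cache_k = torch.zeros(max_batch, max_seq, n_heads, self.head_dim).cpu()
        self.cache_v = torch.zeros(max_batch, max_seq, n_heads, self.head_dim).cpu()
```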

Hopefully proper CPU support will come to the main repo someday. For now I guess I'll just base my own personal experiments on this deprecated repo, or Frankenstein together some hybrid of the two.
