-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
0.2.9 broke the bos/eos/sys handling for chat sequences #800
Comments
verified that this PR does not fully fix the issue, leaving most users locked to version 0.2.7 |
easy to reproduce, just specify a longer chat conversation. at least 3 user and 2 assistant messages result in weird results |
@earonesty just confirming that #850 didn't resolve this? |
correct I tried that branch and while it fixes the first request it does not fix subsequent requests It's really about subsequent requests with a persistent running instance |
I suspect something about the vectors not getting cleared between requests or something is going on |
tried again today with v 0.2.13:
you can see that it starts hallucinating system messages and prompts, instead of generating responses to the "write one sentence about a silly frog". this is vicuna. version 0.2.7 just works every time. |
hmm, looks like maybe this is a problem a little deeper. i can see the engine is formatting the input correctly, and each request looks normal
it's only in subsequent requests that it starts getting odd. tried disabling caching, didn't help. eventually it starts spitting out system prompts as completions. it's like it's chaining my requests, instead of starting over. look at the output of the 3rd request.... two prompts. and each is progressive.
|
yes, that PR and the other fixed it. you now need to specify the chat_format correctly ... it won't guess anymore! |
recent changes (0.2.7 -> 0.2.9.) to the chat framework broke things (this is vicuna 7b)
first call always returns the prompt:
second call works:
third and continuing calls get odder, and show the bos tokens, etc:
trying different versions.
its a token handling issue. gonna try to figure out what broke between 0.2.7 and 0.2.9
looks like it was this:
"+- Add configurable chat formats by @abetlen in #711"
The text was updated successfully, but these errors were encountered: