`0.0` is not a valid value for the backend. `value == 1.0` and `value == 0.0` are tricky values because the intent is not clear: does the user want sampling to be on but without modifying the temperature of the softmax, or should sampling be off? Hence we didn't allow them.
What does the OpenAI API do when these values are used?
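For illustration only (this sketch is mine, not code from the PR, and it assumes the value being discussed is the sampling temperature), here is what the two readings look like:

```python
import numpy as np

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5])

# Reading 1: value == 1.0 -> sampling stays ON and the softmax is untouched,
# because dividing the logits by 1.0 is a no-op.
probs = np.exp(logits / 1.0)
probs /= probs.sum()
token = int(rng.choice(len(logits), p=probs))

# Reading 2: value == 0.0 -> sampling is OFF, i.e. greedy decoding; the value
# itself cannot be used in the softmax (division by zero), so the backend
# would have to special-case it.
token = int(np.argmax(logits))
```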
let me dig into their code to make sure 👍
Based on the concrete-looking information I could quickly find via Google, this is my understanding:
`top_p=1.0` is `top_p = 100%` and is valid. `top_p=0.0` is technically not possible, but seems to be treated as a very, very small number by the OpenAI endpoint (server-side clamping to something like `0.000000001`?).
https://community.openai.com/t/what-is-the-top-p-value-range/612433
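For what it's worth, here is a toy sketch (mine, not code from either codebase; the function name is made up) of how nucleus filtering behaves at the two extremes, which is also why the clamping hypothesis above would turn `top_p=0.0` into an effectively greedy filter:

```python
import numpy as np

def top_p_filter(probs: np.ndarray, top_p: float) -> np.ndarray:
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    # Always keep at least the most probable token.
    cutoff = max(1, int(np.searchsorted(cumulative, top_p)) + 1)
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.5, 0.3, 0.15, 0.05])
print(top_p_filter(probs, 1.0))    # keeps every token -> the filter is a no-op
print(top_p_filter(probs, 1e-9))   # keeps only the top token -> effectively greedy
```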
Sorry for the late response, but after a bit of reading it seems:
`top_p=1` should be fine.
`top_p=0` is not, but OpenAI accepts it; it is unclear how they actually treat it server side. Here are some of the other discussions:
Given that there is no official statement from OpenAI, I feel a bit hesitant to make an assumption about what they are doing, even if it is most likely clamping the value to e.g. `0.000000001`.
Suggestion: allow `top_p=1` in the interest of not breaking client code, but do not allow `top_p=0`, since there is no clear interpretation of it. Would this work for you?
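For concreteness, the suggestion could look like this as validation code (hypothetical function name and error message, not the actual implementation in this PR):

```python
from typing import Optional

def validate_top_p(top_p: Optional[float]) -> Optional[float]:
    if top_p is None:
        return None
    if top_p == 1.0:
        # Accepted so existing clients keep working; equivalent to not filtering at all.
        return None
    if 0.0 < top_p < 1.0:
        return top_p
    # 0.0 (and anything else outside (0, 1]) is rejected because its meaning is unclear.
    raise ValueError(f"`top_p` must be in (0, 1], got {top_p}")
```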
I think allowing `top_p=1.0` should work fine for those using the guidance lib via the API. As for `top_p=0.0`: if the official API accepts this value, and you are creating an API that is supposed to be compatible, should it not accept this value too, even if the actual behavior is slightly different because OpenAI doesn't say what happens in this scenario? The general consensus seems to be that OpenAI clamps it to some very small non-zero value.
Shouldn't the TGI API accept the same values as the official OpenAI API, even if we don't know what OpenAI is really doing inside their black box for `top_p=0`, given that setting `top_p=0.000_001` or similar seems to match OpenAI's behavior based on the conversations reviewed?
`top_p=1.0` really means no top_p, right? We can safely ignore the parameter. In general, I personally prefer to avoid silencing/discarding user input: if a user asks for something we should respect it, and if that thing is nonsensical we should error out. For this adaptation layer we can definitely indulge here (even though I don't see any reason this should ever be an issue in practice; just don't set top_p if you don't want to use this filter).
`top_p=0.0` really means something impossible, since you're not selecting anything. This is a user bug. Again, I'd rather error out than do something weird that nobody has any clue about. `top_p=0.001` is effectively greedy, but there are better ways to ask for greedy (even then, is it really? top_p rarely has p > 0.999). Error > undefined behavior in my book.
I'd rather not follow OpenAI's example here, since it IS undefined behavior. It's quite probable we would do things differently than them, and then people would complain that we're doing things differently, and we'd be stuck choosing between keeping an existing "non OpenAI-compliant" behavior that people might already depend on, and "fixing" it to align, which would be a breaking change for something we're not responsible for (OpenAI could also change their behavior, since it doesn't seem to be defined).
Also, and finally, there is a reason extreme values are an issue. We have had bugs filed because models were doing crazy things with temperature=0.0001. This is almost greedy, and the user's intention was to get greedy, but t=0.0001 is not exactly greedy. The values are so extreme that they cause numerical instabilities, and the logits ended up full of NaN. So there IS a difference between doing greedy and t=0.00001.
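To illustrate that failure mode, here is a self-contained NumPy example (a naive float32 softmax without the usual max-subtraction trick, not TGI's actual kernel):

```python
import numpy as np

logits = np.array([12.0, 10.0, 3.0], dtype=np.float32)

def naive_softmax(x):
    e = np.exp(x)  # no max-subtraction, to show the worst case
    return e / e.sum()

print(naive_softmax(logits / 1.0))    # a well-behaved distribution
print(naive_softmax(logits / 1e-4))   # exp overflows to inf -> every probability becomes NaN
print(int(np.argmax(logits)))         # actual greedy decoding: no softmax, no instability
```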
Regardless, the fix for the OpenAI layer does not belong here but in the OpenAI layer itself (where we can adapt values to our stricter format).
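One possible shape for that adaptation (a hypothetical helper, not the code in this PR): map the OpenAI-accepted no-op onto the backend's convention and surface an explicit error for the undefined value.

```python
from typing import Optional

def adapt_openai_top_p(top_p: Optional[float]) -> Optional[float]:
    """OpenAI-compatibility shim: translate accepted OpenAI values into the
    stricter backend format before the request reaches backend validation."""
    if top_p is None or top_p == 1.0:
        return None  # 1.0 is a no-op filter, so drop the parameter entirely
    if top_p == 0.0:
        # Alternatively this could clamp to a tiny epsilon, but erroring keeps
        # the behavior explicit instead of undefined.
        raise ValueError("top_p=0.0 has no defined behavior; use a value in (0, 1)")
    return top_p
```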