Fix embedding-based classification documentation #597
Merged
The docstring is correct for the class `EmbeddingBasedClassify` ("Scores will be between 0 and 1 but do not have to add up to one."), but not in the classification example notebook `classification.ipynb`. Two follow-up questions here:
1. Prompt-based classification is denoted as single-label classification, whereas embedding-based classification is denoted as multi-label. I do not see a real reason for this differentiation; in my opinion, both approaches (log-prob scores vs. cosine-similarity scores) could be used for both classification tasks, i.e., either assign only the class with the highest score, or assign every class whose score surpasses a threshold that is still to be defined (see the first sketch after this list).
2. Why are prompt-based classification scores normalized (to sum up to 1), but those of embedding-based classification not? (See the second sketch after this list.)
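To illustrate the first question: a minimal sketch (not the library API) of how the same label-to-score mapping supports both decision rules. The `scores` values and the threshold are hypothetical examples.

```python
# Hypothetical label -> score mapping, as either classifier could produce it.
scores = {"finance": 0.72, "sports": 0.31, "politics": 0.65}

# Single-label rule: assign only the class with the highest score.
single_label = max(scores, key=scores.get)
print(single_label)  # finance

# Multi-label rule: assign every class whose score surpasses a chosen threshold.
THRESHOLD = 0.5  # to be defined per use case
multi_label = [label for label, score in scores.items() if score >= THRESHOLD]
print(multi_label)  # ['finance', 'politics']
```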
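And for the second question, a sketch of the normalization the embedding-based scores could undergo so that they also sum up to 1; again the score values are made up, and this is not what the library currently does.

```python
# Hypothetical cosine-similarity scores: each in [0, 1], not summing to 1.
scores = {"finance": 0.72, "sports": 0.31, "politics": 0.65}

# Simple normalization so the scores sum to 1, matching the prompt-based output.
total = sum(scores.values())
normalized = {label: score / total for label, score in scores.items()}

assert abs(sum(normalized.values()) - 1.0) < 1e-9
print(normalized)
```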