Fix embedding-based classification documentation #597

Merged
merged 2 commits into from
Apr 16, 2024


SebastianZettAA
Contributor

The docstring of the class `EmbeddingBasedClassify` is correct ("Scores will be between 0 and 1 but do not have to add up to one."), but the corresponding description in the classification example notebook `classification.ipynb` is not.

Two follow-up questions:

  • The prompt-based classification is described as single-label classification, whereas the embedding-based classification is described as multi-label. I do not see a real reason for this distinction. In my opinion, both kinds of scores (log-prob scores vs. cosine similarity scores) could be used for both classification tasks (i.e., assigning only the class with the highest score vs. assigning every class whose score exceeds a threshold that still has to be defined).
  • Is there a reason why the prompt-based classification scores are normalized (to sum to 1), but those of the embedding-based classification are not?
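
To make the first question concrete, here is a minimal sketch in plain Python. It does not use the library's API: the helper names (`normalize`, `single_label`, `multi_label`), the labels, and all score values are hypothetical and only illustrate that either decision rule can be applied to either kind of score.

```python
import math


def normalize(scores):
    """Rescale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    return {label: value / total for label, value in scores.items()}


def single_label(scores):
    """Single-label decision: pick only the class with the highest score."""
    return max(scores, key=scores.get)


def multi_label(scores, threshold):
    """Multi-label decision: pick every class whose score reaches the threshold."""
    return {label for label, value in scores.items() if value >= threshold}


# Hypothetical cosine-similarity-style scores: each lies in [0, 1],
# but they do not have to add up to one.
embedding_scores = {"positive": 0.82, "negative": 0.35, "neutral": 0.74}

# Hypothetical log-probabilities, exponentiated and normalized to sum to 1.
log_probs = {"positive": -0.4, "negative": -2.3, "neutral": -1.6}
prompt_scores = normalize({label: math.exp(lp) for label, lp in log_probs.items()})

# Both decision rules work on both kinds of scores.
print(single_label(embedding_scores), multi_label(embedding_scores, threshold=0.7))
print(single_label(prompt_scores), multi_label(prompt_scores, threshold=0.15))
```

One difference worth noting: normalized scores depend on how many classes there are, so a fixed threshold means something different for three classes than for ten, while unnormalized similarity scores can be thresholded per class independently.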

@SebastianZettAA SebastianZettAA self-assigned this Mar 8, 2024
@NiklasKoehneckeAA NiklasKoehneckeAA force-pushed the fix-embedding-classification-docu branch from f9269f5 to 83131ab Compare March 28, 2024 09:15
@MerlinKallenbornTNG
Contributor

Aloha,
what's the status of this PR? It looks like nothing has happened for two weeks. Can it be merged, or has it become obsolete?

@SebastianZettAA
Contributor Author

I opened the PR while trying to use the classification task(s) for a client project, when I was still on the customer team. At that time I felt neither responsible for nor entitled to work more directly on the IL than to ask the questions above.
Since I was told that @NickyHavoc primarily worked on this topic, how about we talk about this in person today or tomorrow?

@FlorianSchepersAA FlorianSchepersAA force-pushed the fix-embedding-classification-docu branch 2 times, most recently from 57e0d29 to 48f4f63 Compare April 16, 2024 07:07
@FlorianSchepersAA FlorianSchepersAA merged commit bdb5bf8 into main Apr 16, 2024
4 checks passed
@FlorianSchepersAA FlorianSchepersAA deleted the fix-embedding-classification-docu branch April 16, 2024 07:15
3 participants