Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-3070: Fix inconsistency between best path and scores in ViterbiDecoder #3189

Merged
merged 6 commits into from
Oct 23, 2023
Merged

GH-3070: Fix inconsistency between best path and scores in ViterbiDecoder #3189

merged 6 commits into from
Oct 23, 2023

Conversation

mauryaland
Copy link
Contributor

Related to #3070. It is the same fix used back then #949 that has been removed with the refacto. It is definitely not a good interpretation of probabilities from the Viterbi decoding but at least avoid confusion.

Some improvements could be achieved, see for example this solution from Manning stanfordnlp/stanza#744 (comment) and this CRF implementation https://github.com/RasaHQ/rasa/blob/main/rasa/utils/tensorflow/crf.py.

I would be happy to discuss more about what can be done.

@mauryaland
Copy link
Contributor Author

mauryaland commented Jun 22, 2023

Hi @alanakbik @whoisjones, any thought on this topic? I think it should be discussed regarding the consequences of the CRF scoring implementation.

@mauryaland
Copy link
Contributor Author

@alanakbik @whoisjones @helpmefindaname it could be great to discuss this topic before the release of flair 0.13. Thanks in advance!

@helpmefindaname
Copy link
Collaborator

Hi @mauryaland ,
I agree this should be part of the realse, I am currently fixing the merge conflicts etc. and then talk to @alanakbik about it.

@alanakbik
Copy link
Collaborator

Thanks @mauryaland for improving this!

Code to test:

from flair.data import Sentence
from flair.models import SequenceTagger

# example sentence
sentence = Sentence("I work for the Humboldt Universität of Berlin")

# print NER tags
tagger = SequenceTagger.load("ner-fast")
tagger.predict(sentence, return_probabilities_for_all_classes=True)
print(sentence)

# one token has a higher probability for another class pre-viterbi
difficult_token = sentence[6]

# print the problem token "of"
print(difficult_token)

# print token prediction probabilities
sorted_predictions = sorted(difficult_token.get_tags_proba_dist("ner"), key=lambda x: x.score, reverse=True)
print(sorted_predictions[:2])

Before the PR, the token "of" will have the following individual probabilities:

['Token[6]: "of"'/'O' (0.8465999960899353), 'Token[6]: "of"'/'I-ORG' (0.08910000324249268)]

After the PR, the token "of" will have the following individual probabilities:

['Token[6]: "of"'/'I-ORG' (0.8465999960899353), 'Token[6]: "of"'/'O' (0.08910000324249268)]

Since it is labeled as I-ORG after Viterbi, the score post-PR is more appropriate.

@alanakbik alanakbik merged commit c5c1bf8 into flairNLP:master Oct 23, 2023
1 check passed
@mauryaland mauryaland deleted the crf_score branch October 24, 2023 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants