
Poor F1 results of FinBERT for NER #9

Open
jakelin212 opened this issue Jan 19, 2024 · 4 comments

@jakelin212

Hi, thanks for releasing FinBERT. We are using FinBERT (cased) for NER on some unstructured Finnish medical records and have noticed poor results (F1 < 0.50) on the negated entity label, e.g. 'not lonely' texts containing 'ei ', while the 'lonely' labels get good F1 (~0.80). I was wondering if you have any experience or advice. The problem is not class imbalance, since we also tried a labeling run with only the negative entity.
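To make the setup concrete, a minimal sketch of what the labels look like (the BIO spans here are simplified for illustration, not our exact scheme):

```python
# Illustrative examples only: the negation cue "ei" flips the
# entity label from Lonely to NotLonely on the same surface word.
examples = [
    # positive mention: "yksinäisyys" -> Lonely
    (["yksinäisyys", "on", "ongelma"],
     ["B-Lonely", "O", "O"]),
    # negated mention -> NotLonely
    (["ei", "koe", "yksinäisyyttä"],
     ["B-NotLonely", "I-NotLonely", "I-NotLonely"]),
]
```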

@jouniluoma

Hi, I have used FinBERT for NER with good results. Without knowing more about your dataset, it is really hard to give any advice.

@jakelin212
Author

Hi, thank you for the response, and my apologies for the inaccurate issue title (I wish I could change it). You are right that for the most part FinBERT produces very good NER results. It struggles in cases where the negative-category entries are very close to the positive ones. For example, I have Lonely and NotLonely labels: yksinaisyys = lonely, but yksinaisyys ei ole ongelmia ('loneliness is not a problem') is NotLonely, as is ei koe yksinäisyyttä ('does not feel lonely'). On these entries NotLonely performs badly, with recall and precision both ~0.50, while Lonely reaches ~0.75-0.80. Yes, Lonely entries are much more common, but I tried a project with only NotLonely (negative) entries and saw the same effect.

I think it could be the tokenisation, where the non-O labels are assigned to individual words, and the use of 'strict' mode when computing metrics makes the scores worse compared to 'partial'. I think we will only use BERT for positive NER and then apply regular expressions to assign the negative categories.
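For what it's worth, a minimal sketch of that fallback idea (the cue pattern, window size, and function are illustrative placeholders, not a tested recipe):

```python
import re

# Sketch of the planned fallback: keep BERT NER for the positive class,
# then flip a predicted mention to the negative class when a Finnish
# negation cue appears near it. The cue list and window size below are
# illustrative guesses.
NEGATION_CUES = re.compile(r"\bei(\s+(ole|koe))?\b", re.IGNORECASE)

def assign_polarity(text: str, span: tuple, window: int = 30) -> str:
    """Return 'NotLonely' if a negation cue occurs within `window`
    characters of the predicted 'Lonely' span, else 'Lonely'."""
    start, end = span
    context = text[max(0, start - window):end + window]
    return "NotLonely" if NEGATION_CUES.search(context) else "Lonely"

# "yksinäisyyttä" occupies characters 15-28 in this example sentence
print(assign_polarity("potilas ei koe yksinäisyyttä", (15, 28)))  # -> NotLonely
```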

@jouniluoma

jouniluoma commented Jan 23, 2024

Named entities are usually nouns or noun phrases (something that has a name), or something that can be handled in a similar fashion. I have not really tested NER on adjectives, which is why I was asking about the dataset. Perhaps there is another way than NER to solve your problem with FinBERT? Is there some evidence of this kind of approach working, e.g. in other languages?
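For example (just a sketch of one such alternative, not something I have tested on your data): treat the polarity as sentence-level classification. The checkpoint name below is the public TurkuNLP FinBERT; the label set is a placeholder and the classifier head needs fine-tuning on your records before it is meaningful.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Sketch: classify whole sentences mentioning loneliness as Lonely vs
# NotLonely instead of encoding polarity in NER spans.
MODEL = "TurkuNLP/bert-base-finnish-cased-v1"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=2, id2label={0: "Lonely", 1: "NotLonely"}
)

inputs = tokenizer("ei koe yksinäisyyttä", return_tensors="pt")
logits = model(**inputs).logits  # untrained head: fine-tune first
print(logits.argmax(-1))
```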

@jakelin212
Author

Thanks for your feedback. I have read that BERT does not handle negation well in English either. Feel free to close the ticket.
Best!

https://aclanthology.org/2023.blackboxnlp-1.23.pdf

Allyson Ettinger. 2020. What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models. Transactions of the Association for Computational Linguistics, 8:34–48.
