
[Question]: Using flair for creating tags or keywords that represents a given text #3443

Open
B0rner opened this issue Apr 9, 2024 · 1 comment
Labels
question Further information is requested

Comments


B0rner commented Apr 9, 2024

Question

Hey,
I wonder whether it is possible to use one of the Flair models (for example, one provided via Hugging Face) so that it generates keywords from a text passed to the model, preferably in German.

I tried ner-english-ontonotes-large, which works very well on German text for specific entities.

But what I need is a model that returns not specific entities, such as the name of a person or a city, but a couple of keywords that describe the context of the input text. For example, input: "I like to have pizza for breakfast." output: "breakfast", "pizza", "favorite meal". This should also work for longer texts.

There is no model in the flair-oniverse that fits exactly this use case, right?

But is it possible to train or fine-tune a Flair model for that? I have a huge dataset of texts with related keywords, which would be ideal for fine-tuning a model. Is there a way (and a tutorial) to feed in the training data so that the model fits this use case?

B0rner added the question label Apr 9, 2024
B0rner changed the title from "[Question]: Using flair for creating tags or keywords that repreents a given text" to "[Question]: Using flair for creating tags or keywords that represents a given text" Apr 9, 2024
@fkdosilovic
Contributor

> ... I need a model that returns a couple of keywords that describe the context of the input text. For example, input: "I like to have pizza for breakfast." output: "breakfast", "pizza", "favorite meal". This should also work for longer texts.

What you described is a keyword extraction task. Your best bet is to look at the KeyBERT library.
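To make the KeyBERT suggestion concrete, here is a minimal sketch of how keyword extraction with it might look. It assumes `keybert` is installed and uses the multilingual sentence-transformers model `paraphrase-multilingual-MiniLM-L12-v2` so that German input is covered; the model choice and the helper names `build_model` and `top_keywords` are illustrative assumptions, not something from this thread.

```python
def build_model(model_name="paraphrase-multilingual-MiniLM-L12-v2"):
    """Create a KeyBERT extractor (model name is an assumption; any
    sentence-transformers model that covers German should work)."""
    # Imported lazily so the helper below can be used without keybert installed.
    from keybert import KeyBERT

    return KeyBERT(model=model_name)


def top_keywords(kw_model, text, top_n=5):
    """Return the top_n keyphrases for `text`, without their scores."""
    results = kw_model.extract_keywords(
        text,
        keyphrase_ngram_range=(1, 2),  # allow single words and two-word phrases
        stop_words=None,               # the default stop-word list is English
        top_n=top_n,
    )
    # For a single document, KeyBERT returns a list of (phrase, score) tuples.
    return [phrase for phrase, _score in results]
```

Calling `top_keywords(build_model(), "Ich esse gerne Pizza zum Frühstück.")` would then return a short list of candidate keyphrases. Note that KeyBERT is extractive: it only returns words or phrases that occur in the input, so a paraphrased tag such as "favorite meal" would not come out of this approach.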

> I wonder whether it is possible to use one of the Flair models (for example, one provided via Hugging Face) so that it generates keywords from a text passed to the model, preferably in German.

If you want to "generate" your output, maybe look into KeyLLM from the KeyBERT library.

> There is no model in the flair-oniverse that fits exactly this use case, right?

Judging from Flair's API docs, there doesn't seem to be anything comparable to KeyBERT.

> But is it possible to train or fine-tune a Flair model for that? I have a huge dataset of texts with related keywords, which would be ideal for fine-tuning a model. Is there a way (and a tutorial) to feed in the training data so that the model fits this use case?

Maybe you can train a multi-label classifier, where the inputs are documents and the labels are the keywords that best describe each document.
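A rough sketch of the multi-label idea with Flair could look like the following. It assumes the data is available as `(text, keywords)` pairs and is written in the FastText format that Flair's `ClassificationCorpus` reads (`__label__<keyword>` prefixes followed by the text, one example per line). The helper names, the file names, the label type `"keyword"`, the model `bert-base-german-cased`, and the hyperparameters are all assumptions chosen for illustration.

```python
from pathlib import Path


def to_fasttext_line(text, keywords):
    """Format one example as '__label__kw1 __label__kw2 <text>'."""
    # Labels must not contain spaces, so multi-word keywords get underscores.
    labels = " ".join(
        "__label__" + kw.strip().lower().replace(" ", "_") for kw in keywords
    )
    # Collapse newlines and repeated whitespace so each example stays on one line.
    flat_text = " ".join(text.split())
    return labels + " " + flat_text


def write_split(examples, path):
    """Write an iterable of (text, keywords) pairs to one split file."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        for text, keywords in examples:
            f.write(to_fasttext_line(text, keywords) + "\n")


def train(data_folder, output_dir):
    """Fine-tune a multi-label classifier on train.txt / dev.txt / test.txt."""
    # Imported lazily so the data helpers above work without Flair installed.
    from flair.datasets import ClassificationCorpus
    from flair.embeddings import TransformerDocumentEmbeddings
    from flair.models import TextClassifier
    from flair.trainers import ModelTrainer

    corpus = ClassificationCorpus(
        data_folder,
        train_file="train.txt",
        dev_file="dev.txt",
        test_file="test.txt",
        label_type="keyword",
    )
    label_dict = corpus.make_label_dictionary(label_type="keyword")

    # German BERT is an assumed choice; any suitable transformer could be used.
    embeddings = TransformerDocumentEmbeddings("bert-base-german-cased", fine_tune=True)
    classifier = TextClassifier(
        embeddings,
        label_dictionary=label_dict,
        label_type="keyword",
        multi_label=True,  # each document may carry several keywords
    )

    trainer = ModelTrainer(classifier, corpus)
    trainer.fine_tune(output_dir, learning_rate=5e-5, mini_batch_size=4, max_epochs=10)
```

One design consequence worth keeping in mind: a classifier like this can only predict keywords that appear as labels in the training data, so it suits a fixed tag vocabulary better than open-ended keyword generation.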
