Train quantified object recognition #84
Comments
Dear @ahmed-elyoussefi, thank you for your interest in grobid-quantities. The current implementation of the quantified object recognition uses a dependency parser and applies some heuristics to find the quantified object in the sentence. I started using the current implementation to pre-annotate training data, and then began annotating some documents in order to set up annotation guidelines and find problems. I didn't get very far with that, actually. If you download the master branch, you can pre-annotate data with all the available models using
and, for each PDF, you will generate training data for each model. So to correct the quantified object data you will get a set of files
Give it a try and let me know if you have any issues.
Regarding the annotation, this part is very experimental because we haven't really thought much about it. The annotations are in the form of
with the
The idea, at the beginning of this task, was to use CRF, so it is unlikely to cover distant links between measurements and objects; however, I haven't spent enough time to figure out whether an alternative exists. I hope I haven't forgotten anything; in any case, feel free to ask. Regards
@ahmed-elyoussefi FYI I've added some partially corrected examples under https://github.com/kermitt2/grobid-quantities/tree/master/resources/dataset/quantifiedObject/corpus/staging
Thank you for your replies. Regards
Good question... well, right now you can't, because that part hasn't been written yet. Annotating data for this kind of task is very time-consuming, so I haven't had time to work on it yet. I will have time at some point in the next few weeks, I believe, but I cannot say precisely when.
OK, @ahmed-elyoussefi, I've managed to write the trainer for the quantifiedObject. 😅 It can process data, though the training data haven't been checked by anybody else... so everything is very much an alpha version. I'm planning to plug in the parser soon. To run the trainer you need to type
This particular trainer validates the consistency of the training data, so if there are missing links between measurements and objects it will raise an exception, like:
Happy annotating!! 😆
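For anyone who wants to sanity-check annotations before handing them to the trainer, here is a minimal Python sketch of what such a consistency check could look like. It is only an illustration of the idea, not the trainer's actual code: the `Annotation` class, its fields, the error type and the `validate_links` function are all hypothetical, and it assumes one possible reading of "missing links", namely that every measurement must point to an existing quantified object.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical in-memory representation of an annotation.
# The real grobid-quantities training format is not shown in this thread.
@dataclass
class Annotation:
    id: str
    kind: str                     # "measurement" or "quantifiedObject"
    target: Optional[str] = None  # for measurements: id of the linked object


class InconsistentAnnotationError(ValueError):
    """Raised when a measurement has no valid link to a quantified object."""


def validate_links(annotations: List[Annotation]) -> None:
    # Collect the ids of all annotated quantified objects.
    object_ids = {a.id for a in annotations if a.kind == "quantifiedObject"}
    for a in annotations:
        if a.kind != "measurement":
            continue
        # Assumption: every measurement must be linked to some object.
        if a.target is None:
            raise InconsistentAnnotationError(
                f"measurement {a.id!r} has no linked object")
        # The link must point to an object that actually exists.
        if a.target not in object_ids:
            raise InconsistentAnnotationError(
                f"measurement {a.id!r} points to unknown object {a.target!r}")
```

Calling `validate_links` on a list of annotations returns silently when every measurement is linked, and raises on the first broken link, so annotation errors surface before training starts.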
awesome
…the interval / range in a single code base #84
Hi there,
thank you for such a great project.
I have looked through the documentation to see if there is a way to train quantified object recognition (I know that's still experimental), but can anyone point me to where to start training grobid-quantities to recognize the quantified object, please?