Distinguish extraction problem from post-processing problem #80
Labels
curation
Discussions related to the curation
need further clarification
question
Further information is requested
To obtain a clean, ready-to-use dataset for machine learning from text data mining, two steps are needed.
First, the item of interest has to be properly extracted.
Second, it has to be properly post-processed.
During the curation process, I want to clearly distinguish extraction problems from post-processing problems. Every current "status" or "error-type" already falls into one of the two, but I want to make the distinction explicit.
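As a rough sketch of what making this explicit could look like, each error-type could be mapped to the step where the problem originates. The error-type names below are hypothetical placeholders for illustration, not the labels we actually use in the curation, and `stage_of` is an assumed helper name.

```python
from enum import Enum


class Stage(Enum):
    """The two steps a problem can originate from."""
    EXTRACTION = "extraction"
    POST_PROCESSING = "post-processing"


# Hypothetical error-type names, for illustration only.
# The real curation labels would replace these keys.
ERROR_TYPE_TO_STAGE = {
    "missing_item": Stage.EXTRACTION,
    "wrong_span": Stage.EXTRACTION,
    "unit_not_normalized": Stage.POST_PROCESSING,
    "value_not_parsed": Stage.POST_PROCESSING,
}


def stage_of(error_type: str) -> Stage:
    """Return the step an error-type belongs to, or fail loudly if it is unclassified."""
    try:
        return ERROR_TYPE_TO_STAGE[error_type]
    except KeyError:
        raise ValueError(f"unclassified error type: {error_type!r}") from None
```

Failing on an unknown error-type, rather than defaulting to one step, would force every new label to be assigned to a step when it is introduced.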
Luca is already kindly performing several post-processing steps on the extracted items, but the data are still not fully ready to use. I also want to discuss which parts Luca will take care of and which parts might be our task.
I mean that every curated item would be divided into 3 categories.
It would be great if we could distinguish them during the curation. I hope we can discuss this in the coming meeting.