Review pdf2xml, crossref and CCC RightsLink approaches to providing full text for text mining #202

rkboyce · 2016-12-31T15:59:52Z

There has been considerable progress in the publishing community toward supporting text mining of full-text articles. We need to consider how these developments are relevant to the future of the current NLM R01 and to further enhancements to AnnotationPress. Here are some things to pay attention to:

- Crossref provides an API (https://github.com/CrossRef/rest-api-doc/blob/master/rest_api.md) oriented toward identifying the rights that apply to a full-text article and even the location of its PDF or XML version: https://www.youtube.com/watch?v=LBYgq6jPoyk&feature=youtu.be. There is some important background info on Crossref here: https://www.youtube.com/watch?v=YPCRfNFJgj8
- RightFind is the Copyright Clearance Center's new solution for helping researchers find XML versions of full text for text mining, along with the rights they have to work with those documents: https://www.youtube.com/watch?v=-gUhAkwZbVQ
- pdf2xml seems to be a highly preferred approach in the text mining community for working with PDF content. We need to think about how annotations created in AnnotationPress on PDF documents can be translated to the equivalent XML versions of those documents, because that would be very useful for text miners.
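
To make the Crossref item concrete, here is a minimal sketch of looking up the license and full-text links for one DOI. It assumes the public endpoint `https://api.crossref.org/works/{doi}` and the `license` and `link` fields described in the REST API doc linked above. The function names `fetch_work` and `full_text_info` are made up for illustration, and many records will carry neither field because publishers deposit them voluntarily.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the public Crossref REST API.
CROSSREF_WORKS = "https://api.crossref.org/works/"


def full_text_info(message):
    """Pull license URLs and text-mining links out of one Crossref work record.

    `message` is the dict found under the "message" key of the API response.
    Both fields are optional, so missing keys yield empty lists.
    """
    licenses = [lic.get("URL") for lic in message.get("license", [])]
    links = [
        {"url": link.get("URL"), "content_type": link.get("content-type")}
        for link in message.get("link", [])
        if link.get("intended-application") == "text-mining"
    ]
    return {"licenses": licenses, "links": links}


def fetch_work(doi, mailto=None):
    """Fetch the metadata record for one DOI (hypothetical helper).

    `mailto` is an optional contact address passed as a query parameter.
    """
    url = CROSSREF_WORKS + urllib.parse.quote(doi)
    if mailto:
        url += "?mailto=" + urllib.parse.quote(mailto)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)["message"]
```

Calling `full_text_info(fetch_work(doi))` with a real DOI should return whatever license URLs and XML or PDF locations the publisher has registered. Whether we are actually allowed to download and mine those files still has to be checked against the license itself.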

rkboyce added the question label Dec 31, 2016

rkboyce added this to the Release 6: Linking evidence across claims milestone Dec 31, 2016

rkboyce assigned rkboyce, ningyifan and wenzhang61 Dec 31, 2016
