Workshop on 10 Dec 2015, 09:30-16:30, with optional hackathon at EMBL-EBI on 11 Dec 2015 (separate booking required)
Tickets available here.
Content mining technologies hold great potential for maximising scientific discovery and the reuse of research by automatically searching, indexing and analysing the scientific literature. This workshop aims to educate and engage researchers who are interested in using mining technologies in the biomedical and life sciences.
You will receive training in retrieving full-text articles and automatically extracting useful facts. You can then optionally apply these skills to discovering genomic datasets in a hackathon on 11 Dec hosted by EMBL-EBI and DNAdigest, a Cambridge-based charity empowering efficient access and sharing of genomics data. Advice and some assistance with transport are available; places for the hackathon can be booked here.
The workshop is best suited to those with some familiarity with command-line tools or the enthusiasm to pick them up, but formal programming experience is not a requirement.
This workshop is offered at a heavily subsidised rate of £10, including lunch and refreshments, and places are limited to 25. If you have any queries please email [email protected].
- Start: 10 December 2015 9:30 am
- End: 10 December 2015 4:30 pm
- Venue: The Hauser Forum, Seminar Room 1, 3 Charles Babbage Road, Cambridge, CB3 0GT - view on map
- Facilitators:
- Peter Murray-Rust (@petermurrayrust)
- Jenny Molloy (@jenny_molloy)
- Language: English
- Pad
- ContentMine.org
- hashtag: #contentminelife
Please take a few minutes and fill out our evaluation form after the workshop.
ContentMine
Copyright
Copyright-holder for all works is the Shuttleworth Foundation.
- License for text, slides and images: CC BY 4.0
- License for code: MIT
Welcome!
Time | Agenda |
---|---|
9:30 AM - 10:00 AM | Introduction |
10:00 AM - 10:30 AM | Legal & Responsible Mining |
10:30 AM - 10:50 AM | Coffee Break |
10:50 AM - 11:00 AM | How to work with ContentMine tools |
11:00 AM - 11:45 AM | Building a corpus with getpapers |
11:45 AM - 12:15 PM | sHTML and normalization |
12:15 PM - 1:00 PM | Extracting facts with AMI |
1:00 PM - 2:00 PM | Lunch Break |
2:00 PM - 2:15 PM | More on extracting facts with AMI |
2:15 PM - 3:00 PM | On demand: regular expressions with ami-regex |
3:00 PM - 3:10 PM | Coffee Break |
3:10 PM - 3:55 PM | Facts in context with Jupyter notebooks |
3:55 PM - 4:20 PM | Free exploration with help |
4:20 PM - 4:30 PM | Wrap-up |
We're happy you're thinking about contributing to ContentMine!
There are many ways to contribute:
- by reporting an issue regarding software or training
- by starting your own community
- by suggesting new features
- by writing code and documentation
- by closing issues
- by writing about the project
If you have questions, ask us directly at the DNAdigest Hackday or email our training managers (mail ett stefankasberger dot at, web ett christopherkittel dot eu).
You can also find us online:
- contentmine.org
- @thecontentmine
- office ett contentmine dot org
- Workshop resources: the central source of tutorials for the ContentMine software pipeline, covering the whole toolchain from getpapers and quickscrape to norma and AMI.
- Zotero: Public group with a collection of scientific papers and magazine articles relating to text data mining, copyright and ContentMine