For text mining of biological research articles using DeepDive
Tong Shu Li and Sandip Chatterjee of the Su Lab
- Python 3.4+
- Java 1.8 for various NER/NLP annotators
- PBS/Torque cluster for cluster workflows
- Python package dependencies in requirements.txt
  - lxml may require an external libxml2 installation (using a tool like apt-get)
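The requirements above can be checked before installation with a short script. This is a minimal sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, and it only confirms that a `java` executable is on the PATH, not that it is version 1.8.

```python
import shutil
import sys

MIN_PYTHON = (3, 4)  # minimum version stated in the requirements list


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: the parameters exist only so the check can be
    exercised without changing the real interpreter or PATH.
    """
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            "Python 3.4+ is required, found {}.{}".format(
                version_info[0], version_info[1]
            )
        )
    # Presence check only; the Java version itself is not inspected.
    if which("java") is None:
        problems.append(
            "java not found on PATH (Java 1.8 is needed for the NER/NLP annotators)"
        )
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING: " + problem)
```

Running the script prints one warning per missing prerequisite and nothing when both checks pass.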
- See DeepDive main page for the latest installation instructions
- Run
bash <(curl -fsSL git.io/getdeepdive)
- Install DeepDive by selecting option from menu
- Install PostgreSQL by selecting option from menu
- On Ubuntu (14.04), run
sudo apt-get install -y python3-pip
- Install lxml dependencies using:
sudo apt-get install -y libxml2 libxml2-dev libxslt1-dev lib32z1-dev
- Clone the repo, then run:
cd bioshovel
- Create a virtualenv:
$ python3 -m venv venv
- Activate virtualenv:
$ source venv/bin/activate
- Install dependencies:
(venv) $ pip install -r requirements.txt
- Modules should be run from the src directory
- Use
(venv) $ python3 -m [package_name].[module_name] [args]
- See preprocess and downloaders packages for more information
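The invocation pattern above can also be driven from a script, for example when chaining several modules. The following is a rough sketch under stated assumptions: `build_module_command` and `run_module` are hypothetical helpers, and the module name `example` used in the usage note is a placeholder, not a real module in the preprocess package.

```python
import subprocess
import sys


def build_module_command(package_name, module_name, args=()):
    """Build the argument list for `python -m package.module args...`."""
    dotted = "{}.{}".format(package_name, module_name)
    # sys.executable is the interpreter currently running, which is the
    # virtualenv's Python when the virtualenv is activated.
    return [sys.executable, "-m", dotted] + list(args)


def run_module(package_name, module_name, args=(), src_dir="src"):
    """Run the module with src_dir as the working directory.

    Returns the exit code of the child process.
    """
    command = build_module_command(package_name, module_name, args)
    return subprocess.call(command, cwd=src_dir)
```

Calling `run_module("preprocess", "example", ["--help"])` from the repository root would then be equivalent to running `python3 -m preprocess.example --help` inside src.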
- Tests should be run from the src directory
- Run test discovery using
python3 -m unittest
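For reference, test discovery with no arguments picks up files matching the default pattern `test*.py` in importable packages under the current directory. A minimal sketch of a discoverable test file follows; the class and its contents are hypothetical and do not come from the repository.

```python
import unittest


class ExampleTest(unittest.TestCase):
    """Hypothetical test case; not part of the repository."""

    def test_uppercase(self):
        # Methods whose names start with "test" are collected and run.
        self.assertEqual("deepdive".upper(), "DEEPDIVE")
```

Saved under a name such as test_example.py inside a package in src, a file like this would be collected by `python3 -m unittest`.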