Deployment playbook for the Personal Cancer Genome Reporter: https://github.com/sigven/pcgr

brainstorm/pcgr-deploy

Personal Cancer Genome Reporter deployment recipes

Introduction

Cancer reporting systems require prepopulating several gigabytes of genomic reference data and provisioning all software pieces, docker containers and configuration.

PCGR eases that; pcgr-deploy simplifies it further.

This Ansible playbook contains tasks to deploy PCGR to Amazon and OpenStack clouds, with HPC-specific tasks (mainly NFS mounting) added as a module.

Quickstart

Tweak the file ansible/group_vars/all and the roles section of ansible/site.yml according to your needs (are you an HPC or an AWS user?).

The following lines install Ansible in a virtual environment, deploy PCGR, and run its built-in example as a validation:

python3 -m venv venv && source venv/bin/activate && pip install ansible
ansible-playbook aws.yaml -e 'ansible_python_interpreter=/usr/bin/python3'
ssh ubuntu@<AWS INSTANCE>
cd /mnt/pcgr
./pcgr.py --input_vcf examples/tumor_sample.COAD.vcf.gz --input_cna examples/tumor_sample.COAD.cna.tsv /mnt/pcgr-* output tumor_sample.COAD

Amazon or OpenStack or HPC?

This playbook supports all of them. It has been tested on Tenjin, the private cloud of the Australian NCI supercomputing centre.

The only changes needed are in ansible/group_vars/all, as mentioned in the Quickstart, and rearranging site.yml so that it includes the hpc role after common and databundle. Then running the playbook in the following way should deploy PCGR in your (OpenStack?) VM:

ansible-playbook site.yml -e 'ansible_python_interpreter=/usr/bin/python3' -i <YOUR CLUSTER IP/HOSTNAME>,

Alternatively, if you have python3 already installed in your virtual environment, instantiating and deploying to OpenStack is as easy as:

ansible-playbook openstack.yml

Assuming you are employed by the University of Melbourne and running on Tenjin, that's all you need to do ;)

(Optional) Amazon: Saving money with Spot instances

The following script, included in ansible, queries AWS's spot price history and estimates whether the instance we are asking for will be available. For instance, this is how to run it with an asking price of 0.08 AUD:

python ~/bin/get_spot_duration.py \
	--region ap-southeast-2 \
	--product-description 'Linux/UNIX' \
	--bids c4.large:0.08

The output below shows 168 hours of uptime at that asking price in ap-southeast-2c, which is ~87% savings at the time of writing:

$ ./get_spot_duration.sh
Duration    Instance Type    Availability Zone
168.0    c4.large    ap-southeast-2c
108.2    c4.large    ap-southeast-2a
15.7    c4.large    ap-southeast-2b
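
To make the "Duration" column more concrete, here is a minimal sketch of how such a figure could be derived from spot price history. It is not the code of get_spot_duration.py; the function name `hours_below_bid` and the sample prices are hypothetical. It assumes the history is a list of (timestamp, price) records, each marking the moment the price changed, which is roughly the shape of the `Timestamp` and `SpotPrice` fields that boto3's `describe_spot_price_history` returns.

```python
from datetime import datetime, timezone


def hours_below_bid(history, bid, now):
    """Hours the spot price has stayed at or below `bid`, counting back from `now`.

    `history` is a list of (timestamp, price) tuples. Each tuple marks the
    moment the price changed to `price`; that price holds until the next record.
    """
    # Walk from the newest price change to the oldest.
    ordered = sorted(history, key=lambda record: record[0], reverse=True)
    start = now
    for timestamp, price in ordered:
        if price > bid:
            # The price was above our bid here, so the uninterrupted run ends.
            break
        start = timestamp
    return (now - start).total_seconds() / 3600.0


# Hypothetical price changes for one instance type in one availability zone.
now = datetime(2020, 1, 8, tzinfo=timezone.utc)
history = [
    (datetime(2020, 1, 1, tzinfo=timezone.utc), 0.10),
    (datetime(2020, 1, 3, tzinfo=timezone.utc), 0.05),
    (datetime(2020, 1, 6, tzinfo=timezone.utc), 0.07),
]

print(hours_below_bid(history, bid=0.08, now=now))  # 120.0
```

A longer uninterrupted run below the bid suggests, but does not guarantee, that an instance requested at that price would keep running.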

Kubernetes

This is an open-ended experiment for now; there are some errors that still need attention.

FAQ

The error "ERROR: package is not a legal parameter in an Ansible task or handler" is a symptom of an Ansible version that is too old (probably 1.9.x). You need Ansible >= 2.x to deploy this.
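
If you want to catch this before running the playbook, a small pre-flight check along these lines may help. This is only a sketch: the helper names `ansible_major_version` and `is_supported` are hypothetical, and it assumes the first line of `ansible --version` looks like `ansible 2.9.6` or `ansible [core 2.12.1]`.

```python
import re


def ansible_major_version(version_output):
    """Extract the major version from the first line of `ansible --version` output."""
    lines = version_output.strip().splitlines()
    if not lines:
        raise ValueError("empty version output")
    # Look for the first "X.Y" pair on the first line, e.g. "2.9" in "ansible 2.9.6".
    match = re.search(r"(\d+)\.(\d+)", lines[0])
    if match is None:
        raise ValueError(f"could not find a version number in: {lines[0]!r}")
    return int(match.group(1))


def is_supported(version_output, minimum_major=2):
    """Return True when the reported Ansible major version is at least `minimum_major`."""
    return ansible_major_version(version_output) >= minimum_major


print(is_supported("ansible 1.9.4\n  configured module search path = None"))  # False
print(is_supported("ansible 2.9.6\n  config file = None"))  # True
```

The version text could be obtained with, for example, `subprocess.run(["ansible", "--version"], capture_output=True, text=True).stdout`.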
