The ZIPPY API
Wise, Aaron edited this page Feb 15, 2018 · 7 revisions
ZIPPY is designed primarily to be run from the command line, using JSON files as the source of record for an execution. That model doesn't always fit, though: what about when you want to run a very similar workflow across hundreds of different samples or runs? Well, there's an API for that.
The ZIPPY API lets you load, modify, save and execute ZIPPY parameters files. The API has a whole 4 functions:
def load_params(fname, defaults_fname=None):
    """
    Loads a valid ZIPPY parameters file or template file from disk. Represents the file as a native python object (c.f., the python json module)
    """
def save_params(params, fname):
    """
    Writes a python object (structured as json) to a file. Used to write files which can then be run using 'python zippy.py your_file.json'
    """
def build_zippy(params_dictionary):
    """
    Call ZIPPY from python, specifying a params dictionary instead of going through makeparams. Returns a zippy object,
    which can then be run using the call x.run_zippy()
    """
def run_zippy(self, mode='sge'):
    """
    Method of the object returned by build_zippy. You can run either on sge or local by setting the mode.
    """
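To make the "native python object" idea concrete, here is a minimal sketch of how attribute-style access to a parameters file (as in `params.stages[0].output_dir`) can be obtained with the standard json module. This is not ZIPPY's implementation: `load_as_namespace` and `dump_namespace` are hypothetical stand-ins for `load_params` and `save_params`, they work on text rather than files, and they assume Python 3 (`types.SimpleNamespace`).

```python
import json
from types import SimpleNamespace


def load_as_namespace(text):
    """Parse JSON text so that every JSON object becomes an attribute-accessible namespace."""
    return json.loads(text, object_hook=lambda d: SimpleNamespace(**d))


def dump_namespace(obj):
    """Serialize a namespace tree back to JSON text (hypothetical stand-in for save_params)."""
    # vars() turns each SimpleNamespace back into a plain dict for the encoder
    return json.dumps(obj, default=vars, indent=4, sort_keys=True)


# A cut-down template, just enough to show the load / modify / save cycle
template = '{"stages": [{"identifier": "data", "output_dir": ""}], "scratch_path": ""}'

params = load_as_namespace(template)
params.stages[0].output_dir = "/path/to/output/data"
params.scratch_path = "/path/to/scratch/run1"

# Re-parse the saved text as plain dicts to confirm the edits survived the round trip
round_tripped = json.loads(dump_namespace(params))
```

The object ZIPPY actually returns may behave differently; the sketch only shows why edits made through attributes can end up in a saved JSON file.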
Example: running the same ZIPPY settings over multiple runs. Given an input map of runs (run identifier to run directory), we load a base parameters file and set the input run directory and other parameters on a per-run basis.
import os

from zippy import zippy

# run_map (run identifier -> run directory) and get_samples() are assumed to be defined by the caller
for identifier, dir_name in run_map.items():
    # load the template params file
    params = zippy.load_params('base_params.json')
    # give every stage its own output directory under the run directory
    for stage in params.stages:
        stage.output_dir = os.path.join(dir_name, stage.identifier)
    # prepare the input to the data stage
    params.stages[0].samples = get_samples(identifier)
    params.scratch_path = '/path/to/scratch/{}'.format(identifier)
    wflow = zippy.build_zippy(params)
    # keep a record of exactly what was run for this identifier
    zippy.save_params(params, 'api_params_{}.json'.format(identifier))
    wflow.run_zippy()
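The per-run edits in the loop can also be factored into a function and checked without launching a workflow. The sketch below does this on plain dictionaries (as returned by `json.load`) rather than on the object ZIPPY's `load_params` returns, so it is an illustration of the pattern, not a drop-in replacement. `customize_params`, the run directories, and the sample names are all hypothetical, and the real format of the `samples` field is not specified here.

```python
import copy
import os


def customize_params(base_params, identifier, dir_name, samples):
    """Return a per-run copy of base_params (a plain dict) with paths and samples filled in."""
    params = copy.deepcopy(base_params)  # leave the shared template untouched
    for stage in params["stages"]:
        stage["output_dir"] = os.path.join(dir_name, stage["identifier"])
    params["stages"][0]["samples"] = samples  # first stage is the data stage
    params["scratch_path"] = "/path/to/scratch/{}".format(identifier)
    return params


# Template mirroring the structure of base_params.json
base = {
    "stages": [
        {"identifier": "data", "output_dir": "", "samples": "", "stage": "data"},
        {"identifier": "strelka", "output_dir": "", "previous_stage": ["data"], "stage": "strelka"},
    ],
    "scratch_path": "",
}

# Hypothetical run map: run identifier -> run directory
run_map = {"run1": "/data/run1", "run2": "/data/run2"}

per_run = {
    identifier: customize_params(base, identifier, dir_name, ["sampleA", "sampleB"])
    for identifier, dir_name in run_map.items()
}
```

Working on a deep copy means the template only has to be read once, and each run's parameters can be inspected before anything is submitted.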
And here is the example base_params.json:
{
    "stages": [
        {
            "identifier": "data",
            "output_dir": "",
            "samples": "",
            "stage": "data"
        },
        {
            "identifier": "strelka",
            "is_somatic": true,
            "output_dir": "",
            "previous_stage": ["data"],
            "stage": "strelka",
            "args": "--exome --callMemMb=1024"
        }
    ],
    "scratch_path": "",
    "python": "path/to/python",
    "strelka_path": "path/to/strelka",
    "genome": "path/to/genome"
}
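When a template like this is edited by hand and then reused across hundreds of runs, a quick sanity check before submission can catch wiring mistakes early. The sketch below is not part of ZIPPY: `check_stage_links` is a hypothetical helper, and it assumes that every entry in a stage's `previous_stage` list is meant to name the `identifier` of another stage in the same file, as in the example above.

```python
import json

BASE_PARAMS_TEXT = """
{
    "stages": [
        {"identifier": "data", "output_dir": "", "samples": "", "stage": "data"},
        {"identifier": "strelka", "is_somatic": true, "output_dir": "",
         "previous_stage": ["data"], "stage": "strelka",
         "args": "--exome --callMemMb=1024"}
    ],
    "scratch_path": ""
}
"""


def check_stage_links(params):
    """Return a list of human-readable problems found in the stage wiring of a params dict."""
    identifiers = [stage["identifier"] for stage in params["stages"]]
    known = set(identifiers)
    problems = []
    # Identifiers are used as references, so they should be unique
    for ident in sorted(known):
        if identifiers.count(ident) > 1:
            problems.append("duplicate stage identifier: {}".format(ident))
    # Every previous_stage entry should point at a stage that exists
    for stage in params["stages"]:
        for prev in stage.get("previous_stage", []):
            if prev not in known:
                problems.append(
                    "stage {} refers to unknown stage {}".format(stage["identifier"], prev)
                )
    return problems


params = json.loads(BASE_PARAMS_TEXT)
problems = check_stage_links(params)
for problem in problems:
    print(problem)
```

ZIPPY may perform its own validation when a workflow is built; this check only covers the one assumption stated above.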