File Staging
Managing files in distributed systems is tedious: differing paths, names, and file versions complicate distributed runs. Since version 0.3.38, BigJob (BJ) includes some basic file staging capabilities. For each BigJob, a directory named after the id of the BigJob is created:
<BIGJOB_WORKING_DIRECTORY>/bj-54aaba6c-32ec-11e1-a4e5-00264a13ca4c/
<BIGJOB_WORKING_DIRECTORY>/bj-3645d5e8-32ec-11e1-b346-00264a13ca4c/
<BIGJOB_WORKING_DIRECTORY>/bj-398e110a-32e9-11e1-ae24-00264a13ca4c/
Files can be staged to the BJ working directory using the filetransfer parameter of `start_pilot_job`:
import os

# Stage test.txt (located next to this script) into the BigJob working directory.
bj_filetransfers = ["ssh://" + os.path.dirname(os.path.abspath(__file__))
                    + "/test.txt > BIGJOB_WORK_DIR"]

bj.start_pilot_job(lrms_url,
                   None,
                   number_of_processes,
                   queue,
                   project,
                   workingdirectory,
                   userproxy,
                   walltime,
                   processes_per_node,
                   bj_filetransfers)
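The transfer entries in the example are plain strings of the form "<source URL> > <target>". The following is a minimal sketch of a helper that builds such strings, assuming the format is exactly what the example shows. The name `make_transfer` and its parameters are hypothetical and not part of BigJob.

```python
import os


def make_transfer(local_path, target="BIGJOB_WORK_DIR", scheme="ssh"):
    """Build a 'source > target' transfer string (hypothetical helper)."""
    # Same construction as in the example: scheme prefix plus absolute path.
    source = scheme + "://" + os.path.abspath(local_path)
    return source + " > " + target


# A list in the shape passed as the filetransfer parameter in the example.
bj_filetransfers = [make_transfer("test.txt")]
```

Keeping the string construction in one place avoids repeating the path handling for every staged file.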
The stdout and stderr of the BJ agent are also written to the BJ directory.
For each sub-job a sub-directory is created in the directory of the parent BJ:
<BIGJOB_WORKING_DIRECTORY>/bj-54aaba6c-32ec-11e1-a4e5-00264a13ca4c/sj-55010912-32ec-11e1-a4e5-00264a13ca4c
<BIGJOB_WORKING_DIRECTORY>/bj-54aaba6c-32ec-11e1-a4e5-00264a13ca4c/sj-55153072-32ec-11e1-a4e5-00264a13ca4c
By default (i.e. if no working directory is specified in its job description), each sub-job is executed in its sub-job specific directory. If a working directory is specified, the sub-job is executed in that directory instead.
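The rule for choosing a sub-job's working directory can be sketched as follows. This is an illustration of the behaviour described here, not BigJob's actual implementation. The function name and its parameters are hypothetical.

```python
import posixpath


def subjob_working_directory(bigjob_dir, subjob_dir_name, requested_dir=None):
    """Return the directory a sub-job would run in (illustrative only)."""
    # A working directory given in the job description takes precedence.
    if requested_dir:
        return requested_dir
    # Otherwise the sub-job runs in its own directory below the parent BJ.
    return posixpath.join(bigjob_dir, subjob_dir_name)
```

For example, `subjob_working_directory("/work/bj-1", "sj-2")` yields `/work/bj-1/sj-2`, while passing `requested_dir="/scratch/run"` yields `/scratch/run`.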
Files can be staged to the sub-job directory by using the filetransfer attribute:
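jd = description()
jd.executable = "/bin/cat"
jd.number_of_processes = "1"
jd.spmd_variation = "single"
# test.txt is staged into the sub-job directory, so a relative name is enough.
jd.arguments = ["test.txt"]
jd.output = "stdout.txt"
jd.error = "stderr.txt"
# Stage test.txt (located next to this script) into the sub-job directory.
jd.filetransfer = ["ssh://" + os.path.dirname(os.path.abspath(__file__))
                   + "/test.txt > SUBJOB_WORK_DIR"]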
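To make the structure of a transfer entry explicit, the sketch below splits one into its source and target parts. It assumes the two parts are separated by a single ">" as in the example. `split_transfer` is a hypothetical helper and does not reflect how BigJob parses these strings internally.

```python
def split_transfer(spec):
    """Split a 'source > target' string into (source, target)."""
    source, sep, target = spec.partition(">")
    if not sep or not source.strip() or not target.strip():
        raise ValueError("expected '<source> > <target>', got: %r" % spec)
    return source.strip(), target.strip()


source, target = split_transfer("ssh:///tmp/test.txt > SUBJOB_WORK_DIR")
```

Here `source` is `ssh:///tmp/test.txt` and `target` is `SUBJOB_WORK_DIR`. A check like this can catch malformed entries before a job is submitted.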
BigJob supports different file staging mechanisms, currently SSH and Globus Online. Details on the Globus Online support can be found here.