
SAGA Tutorial Part 3: Remote Job Submission

oleweidner edited this page Oct 18, 2012 · 40 revisions

SAGA Layers

To make our example more distributed, we slightly modify our previous SAGA program so that the job runs on a remote machine, via a different plug-in, instead of on localhost.

Prerequisites

This example assumes that you have SSH access to a remote resource, either a single host (e.g., a cloud VM) or an HPC cluster. Alternatively, you can run an SSH server on your local machine to 'emulate' a remote resource.

The example also assumes that you have a working public/private SSH key pair and that you can log in to your remote resource of choice using those keys, i.e., your public key is in the ~/.ssh/authorized_keys file on the remote machine. If you are not sure how this works, you might want to read SSH and GSISSH first.
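Before running the example, it can help to confirm that key-based login works without a password prompt, since a scripted submission typically cannot answer one. The sketch below is not part of SAGA or Bliss. It assumes the OpenSSH `ssh` client is on your PATH and that the remote side can run the POSIX `true` command; the helper names `build_ssh_check_command` and `can_login_with_key` are hypothetical.

```python
import subprocess


def build_ssh_check_command(host, user=None, key_file=None):
    """Build an ssh command line that fails instead of prompting for a password."""
    # BatchMode=yes disables interactive password/passphrase prompts,
    # ConnectTimeout bounds how long we wait for an unreachable host.
    cmd = ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10"]
    if key_file is not None:
        cmd += ["-i", key_file]  # use a specific private key
    target = host if user is None else "%s@%s" % (user, host)
    # 'true' does nothing and exits with status 0 on the remote side
    cmd += [target, "true"]
    return cmd


def can_login_with_key(host, user=None, key_file=None):
    """Return True if a non-interactive, key-based login to host succeeds."""
    try:
        return subprocess.call(build_ssh_check_command(host, user, key_file)) == 0
    except OSError:
        # the ssh client is not installed or not on the PATH
        return False
```

For example, `can_login_with_key("remote.example.org", user="myuser")` (both names are placeholders) should return `True` once your public key is installed on that host.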

Hands-On: Remote Job Submission

Before we discuss the individual API calls in more detail, let's get down and dirty and run our first example: creating and running a SAGA job on your local machine. For this purpose, create a new file saga_example_1.py and paste the following code:

import sys
import bliss.saga as saga

def main():
    try: 
        # create a job service for the local machine
        js = saga.job.Service("fork://localhost")

        # describe our job
        jd = saga.job.Description()

        jd.environment     = {'MYOUTPUT':'"Hello from Bliss"'}       
        jd.executable      = '/bin/echo'
        jd.arguments       = ['$MYOUTPUT']
        jd.output          = "my1stjob.stdout"
        jd.error           = "my1stjob.stderr"

        # create the job (state: New)
        myjob = js.create_job(jd)

        print "Job ID    : %s" % (myjob.jobid)
        print "Job State : %s" % (myjob.get_state())

        print "\n...starting job...\n"
        # run the job 
        myjob.run()

        print "Job ID    : %s" % (myjob.jobid)
        print "Job State : %s" % (myjob.get_state())

        print "\n...waiting for job...\n"
        # wait for the job to either finish or fail
        myjob.wait()

        print "Job State : %s" % (myjob.get_state())
        print "Exitcode  : %s" % (myjob.exitcode)

    except saga.Exception, ex:
        print "An error occurred during job execution: %s" % (str(ex))
        sys.exit(-1)

if __name__ == "__main__":
    main()

Save the file and execute it with the Python interpreter (make sure your virtualenv is activated):

python saga_example_1.py

The output should look something like this:

Job ID    : [fork://localhost]-[None]
Job State : saga.job.Job.New

...starting job...

Job ID    : [fork://localhost]-[644240]
Job State : saga.job.Job.Pending

...waiting for job...

Job State : saga.job.Job.Done
Exitcode  : None

Once the job has completed, you can have a look at the output file my1stjob.stdout.
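The job IDs printed above appear to follow the pattern `[service URL]-[native ID]`, with `None` as the native ID before the job has been started. Below is a minimal sketch for splitting such a string. It assumes this pattern holds in general, which is inferred from the sample output only, and `split_job_id` is a hypothetical helper, not a Bliss function.

```python
import re

# Pattern inferred from the sample output, e.g. "[fork://localhost]-[644240]"
JOB_ID_PATTERN = re.compile(r"^\[(?P<service>.+)\]-\[(?P<native_id>.+)\]$")


def split_job_id(job_id):
    """Split a job ID string into (service_url, native_id).

    native_id is returned as None if the job has not been started yet.
    """
    match = JOB_ID_PATTERN.match(job_id)
    if match is None:
        raise ValueError("unexpected job ID format: %r" % (job_id,))
    native_id = match.group("native_id")
    if native_id == "None":
        native_id = None  # no backend ID has been assigned yet
    return match.group("service"), native_id
```

This can be handy when you want to look up the native ID with the backend's own tools, for example a process ID on a local machine.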

Note: Because you are running on your local machine rather than submitting to a cluster, the job state will most likely go straight to "Running" instead of "Pending" after the job is started, since your machine does not have to queue the job before executing it. Similarly, the exit code will most likely be "0" instead of "None": your machine actually returns "0", whereas some SGE clusters do not return an exit code at all.
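Since the exit code may be either a number or `None` depending on the backend, code that checks for success should probably handle both cases. The following is a sketch under stated assumptions: states are compared by the names shown in the sample output (such as `saga.job.Job.Done`), a missing exit code on a `Done` job is treated as success, and `job_succeeded` is a hypothetical helper rather than part of Bliss.

```python
def job_succeeded(state, exitcode):
    """Decide whether a finished job succeeded.

    state    -- final job state as printed above, e.g. "saga.job.Job.Done"
    exitcode -- the job's exit code, or None if the backend reported none
    """
    # Anything other than a 'Done' state counts as not successful.
    if not str(state).endswith("Done"):
        return False
    # Assumption: with no exit code available, trust the 'Done' state.
    if exitcode is None:
        return True
    return int(exitcode) == 0
```

In the example program you could call `job_succeeded(myjob.get_state(), myjob.exitcode)` after `myjob.wait()` returns.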

Details & Discussion

