SAGA Tutorial Part 1: Introduction

oleweidner edited this page Oct 26, 2012 · 30 revisions

The SAGA Python module provides an object-oriented programming interface for job submission and management, resource allocation, file handling, and coordination and communication: functionality that is required in the majority of distributed applications, frameworks and tools.

The big picture looks like this, but as an application developer you don't have to worry about most of it:

[Figure: SAGA Layers]

SAGA encapsulates the complexity and heterogeneity of different distributed computing systems and 'cyberinfrastructures' by providing a single, coherent API to the application developer. A plug-in mechanism that is transparent to the application translates the API calls to the different middleware interfaces. Currently, the following plug-ins are available in SAGA Python:

  • Local - Allows job execution (via fork) and file handling on the local machine.
  • SSH - Allows job execution on remote hosts via SSH.
  • GSISSH - Allows job execution on remote hosts via GSISSH.
  • SFTP - Provides remote filesystem access via the SFTP protocol.
  • PBS, PBS+SSH, PBS+GSISSH - Provides local and remote access (via SSH or GSISSH) to PBS/TORQUE clusters.
  • SGE, SGE+SSH, SGE+GSISSH - Provides local and remote access (via SSH or GSISSH) to Sun (Oracle) Grid Engine clusters.

More details about the available plug-ins and how to use them can be found on the SAGA Plugins page. In part 2 of this tutorial, we will start by using the Local job plug-in. In part 3, we will use one of the remote job submission plug-ins, either SSH, PBS+SSH or SGE+SSH, depending on the compute resources you have access to.

What will I Learn in this Tutorial?

This tutorial introduces two of SAGA's main capabilities: job creation, submission and management, and data (file) handling.

You will learn how to:

  1. Install SAGA on your own machine
  2. Write a program that runs a job locally on your machine
  3. Use the same program with a different plug-in to run the job on a remote site
  4. Add file transfer capabilities to the program to retrieve results

Once you have worked your way through the examples, you will be able to write your own distributed application with SAGA Python and understand how you can run it across one or more distributed resources.

Installation

A small Python command-line tool called virtualenv lets you create a local Python environment (sandbox) in user space, so you can install additional Python packages without having to be 'root'.

NOTE: SAGA-Python requires Python >= 2.5. It won't work with an older version of Python!
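If you are unsure which Python version your interpreter provides, a small check like the following may help. It is only a sketch: the helper name meets_minimum is made up for this illustration and is not part of SAGA, and the minimum version (2.5) is taken from the note above.

```python
import sys

# Minimum version taken from the note above (SAGA-Python requires Python >= 2.5).
MINIMUM = (2, 5)


def meets_minimum(version_info, minimum=MINIMUM):
    """Return True if the (major, minor, ...) tuple is at least `minimum`.

    Hypothetical helper for this tutorial, not part of SAGA.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # sys.version_info[:2] is the (major, minor) pair of the running interpreter.
    if meets_minimum(sys.version_info):
        print("Python %d.%d is recent enough" % sys.version_info[:2])
    else:
        print("Python %d.%d is too old" % sys.version_info[:2])
```

Save this to a file and run it with the same python executable you plan to use for the virtualenv. It only checks the lower bound stated in the note.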

To create your local Python environment run the following command (you can install virtualenv on most systems via apt-get or yum, etc.):

virtualenv $HOME/tutorial

If you don't have virtualenv installed and you don't have root access on your machine, you can use the following script instead:

curl --insecure -s https://raw.github.com/pypa/virtualenv/master/virtualenv.py | python - $HOME/tutorial

Activate your Local Python Environment

You need to activate your Python environment in order to make it work. Run the command below. ('activate' temporarily modifies your PATH so that the python and pip executables in $HOME/tutorial/bin are found first. Packages are then installed into $HOME/tutorial/lib/python2.7/site-packages/, or the matching directory for your Python version, instead of the global site-packages directory):

source $HOME/tutorial/bin/activate

Activating the virtualenv is very important. If you don't activate your virtualenv, the rest of this tutorial will not work. You can usually tell that your environment is activated properly if your bash command-line prompt starts with (tutorial).
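Besides looking at the prompt, you can also ask the interpreter itself. The following is a minimal sketch, assuming that the activate script exports the VIRTUAL_ENV environment variable (virtualenv does this) and that the interpreter exposes either sys.real_prefix or sys.base_prefix. The function names in_virtualenv and active_env_name are hypothetical helpers for this illustration.

```python
import os
import sys


def in_virtualenv():
    """Best-effort check: True if this interpreter runs inside a virtual environment."""
    # Classic virtualenv sets sys.real_prefix on the sandboxed interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer interpreters expose sys.base_prefix, which differs from sys.prefix
    # inside an environment. If it is missing, fall back to "not in an environment".
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def active_env_name(environ=None):
    """Return the directory name of the activated environment, or None."""
    environ = os.environ if environ is None else environ
    path = environ.get("VIRTUAL_ENV")
    if not path:
        return None
    return os.path.basename(path.rstrip(os.sep))


if __name__ == "__main__":
    print("inside a virtual environment: %s" % in_virtualenv())
    print("activated environment: %s" % active_env_name())
```

If the environment from this tutorial is active, the second line should name the tutorial directory. If it reports None, run the source command above again in the same shell.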

Install SAGA Python

The latest SAGA Python module is available via the Python Package Index (PyPI). PyPI packages are installed much like Linux deb or rpm packages, with a tool called pip (which stands for pip installs packages). Pip is installed by default in your virtualenv, so in order to install SAGA Python, the only thing you have to do is this:

pip install bliss

You will see pip download and unpack several packages. If everything worked, the last two lines should look like this:

Successfully installed bliss paramiko-on-pypi pycrypto-on-pypi
Cleaning up...

NOTE: fatal error: Python.h: No such file or directory. If you see this error message during installation, you don't have the Python header files installed. In order to fix this, please refer to Known Problems And Solutions.

To make sure that your installation works, run the following command to check if the SAGA Python module (bliss) can be imported by the interpreter (the output should be the version number of the bliss module):

python -c "import bliss; print bliss.version"
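The one-liner above stops with a traceback if the import fails. As a slightly more forgiving variant, the sketch below wraps the import in a small function. It assumes only what the command above shows, namely that the module is called bliss and exposes a version attribute. The function name module_version is a hypothetical helper for this illustration.

```python
def module_version(name):
    """Import module `name` and return its `version` attribute.

    Returns None if the module cannot be imported or has no `version` attribute.
    Hypothetical helper for this tutorial, not part of SAGA.
    """
    try:
        module = __import__(name)
    except ImportError:
        return None
    return getattr(module, "version", None)


if __name__ == "__main__":
    version = module_version("bliss")
    if version is None:
        # Either the import failed or the module exposes no version attribute.
        print("No bliss version found; is the virtualenv activated?")
    else:
        print("bliss version: %s" % (version,))
```

If this reports that no version was found, check that the virtualenv is activated and that pip install bliss finished without errors.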

Back: [Tutorial Home](SAGA Tutorial)    Next: SAGA Tutorial Part 2: Local Job Submission