Home
The SAGA BigJob framework is a SAGA-based pilot job implementation. The Simple API for Grid Applications (SAGA) is a high-level, easy-to-use API for accessing distributed resources. SAGA BigJob supports a wide range of application types and is usable over a broad range of infrastructures; that is, it is general-purpose, extensible, and interoperable. Unlike other common pilot job systems, SAGA BigJob (i) natively supports MPI jobs and (ii) works on a variety of back-end systems, which generally reflects the advantages of a SAGA-based approach. The following figure gives an overview of the SAGA BigJob architecture.
Pilot jobs decouple workload submission from resource assignment. This results in a flexible execution model, which in turn enables the distributed scale-out of applications on multiple and possibly heterogeneous resources. Jobs can be executed without having to queue each one individually.
The pilot job provides a container for many sub-jobs; that is, applications submit these sub-jobs through the pilot job rather than through the resource manager. A major advantage of this approach is that sub-jobs avoid the waiting time at the local resource manager, which usually contributes significantly to the overall time-to-completion.
An overview of the BigJob architecture can be found [here](https://github.com/saga-project/BigJob/wiki/BigJob-Architecture).
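The effect of queuing once instead of once per job can be illustrated with a deliberately simplified back-of-the-envelope model. The sketch below is not part of BigJob; the function names are hypothetical, and it assumes a fixed queue wait and jobs that run one after another, which real resource managers do not guarantee.

```python
def completion_time_direct(n_jobs, queue_wait, runtime):
    """Every job is queued on its own, so every job pays the queue wait.

    Assumption: jobs are submitted one after another (e.g. a dependency
    chain) and the queue wait is the same for each submission.
    """
    return n_jobs * (queue_wait + runtime)


def completion_time_pilot(n_jobs, queue_wait, runtime):
    """Only the pilot is queued; sub-jobs then run back to back inside it.

    Assumption: the pilot's walltime is long enough to hold all sub-jobs.
    """
    return queue_wait + n_jobs * runtime


# Hypothetical workload: 10 jobs, 30 minutes of queue wait, 5 minutes each.
direct = completion_time_direct(n_jobs=10, queue_wait=30, runtime=5)
pilot = completion_time_pilot(n_jobs=10, queue_wait=30, runtime=5)
print(f"direct submission: {direct} min, via pilot: {pilot} min")
```

Under these assumptions the queue wait is paid once rather than ten times; the numbers are illustrative only and are not measurements.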
If you are a user / domain scientist and you want to run your computational workload on one of the following XSEDE machines, just follow the links below. The instructions explain how to set up your environment on these machines with a few simple commands to use a pre-installed version of BigJob. If you are planning to use any of these machines, we highly recommend following these instructions: using a pre-installed version of BigJob is much simpler and less error-prone than installing your own version of BigJob!
- Lonestar (TACC)
- Kraken (NICS)
- Ranger (TACC)
- Trestles (SDSC)
- Multiple Resources
If you are a user / domain scientist and you want to run your computational workload on one of the following FutureGrid machines, just follow the links below. The instructions explain how to set up your environment on FutureGrid machines with a few simple commands to use a pre-installed version of BigJob.
- [Instructions: BigJob on FutureGrid](https://github.com/saga-project/BigJob/wiki/How-to-Run-BigJob-on-FutureGrid)
In the context of the ExTENCI Project, we have developed experimental bindings that allow BigJob to access Open Science Grid's (OSG) glide-in WMS Condor pool. While these bindings are primarily used internally in application-specific Science Gateways (e.g., DARE-Cactus) to access OSG and XSEDE resources concurrently, it is also possible to run BigJob directly from the command line on OSG resources.
- [Instructions: BigJob on OSG](https://github.com/saga-project/BigJob/wiki/How-to-Run-BigJob-on-OSG)
If you are a user / domain scientist and you want to run your computational workload on one of the following LONI machines, just follow the links below. Running BigJob on LONI requires a valid grid certificate. Once you have received a grid certificate on one machine, you can use it on other LONI machines by copying it to them. If you don't have a certificate, follow the instructions at [Requesting Grid Certificate](https://docs.loni.org/wiki/Requesting_a_LONI_Grid_Certificate).
The instructions explain how to set up your environment on these machines with a few simple commands.
For an overview of application execution with BigJob see:
[Application Execution](https://github.com/saga-project/BigJob/wiki/Application-Execution-and-Examples)
Below is a "living" list of example scripts that show how BigJob is used with different applications and different types of workloads. Each example focuses on a particular type of application (e.g., BFAST or AMBER), but is often representative of a larger class of applications (e.g., single-core, multi-core, MPI, coupled, uncoupled, ...).
- [Example 1](https://github.com/saga-project/BigJob/blob/master/examples/example_local_single.py): Running a single BigJob and a single sub-job on localhost.
- Example 2: Running single-core, uncoupled BFAST genome matching workloads with BigJob.
- [Example 3](https://github.com/saga-project/BigJob/blob/master/examples/example_local_single_filestaging.py): Using BigJob with file staging. A guide describing the usage of file staging can be found here.
These instructions are targeted towards experienced users and system administrators who need to install their own version of BigJob on an HPC cluster either in user or in system space. Please follow the steps below if this is what you want to do!
- Overview / Concepts
- Install & Configure SAGA
- Install & Configure Redis
- Install & Configure BigJob
- Test the Installation
The BigJob API defines two main classes: `bigjob`, representing the pilot, and `subjob`, representing an individual task. The API documentation can be found at:
The Pilot-API can be used as an alternative way of accessing BigJob:
[Pilot-API](https://github.com/saga-project/BigJob/wiki/Pilot-API) (Alpha!)
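The split between a pilot class and a sub-job class can be illustrated with a small, self-contained toy model. This is not the BigJob API: `ToyPilot`, `ToySubJob`, and all of their methods are hypothetical names chosen for this sketch, and a local thread pool stands in for the resources a real pilot would acquire from a resource manager.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyPilot:
    """Hypothetical stand-in for the pilot: acquires worker slots once."""

    def __init__(self, slots):
        # A thread pool plays the role of the resources held by the pilot.
        self._pool = ThreadPoolExecutor(max_workers=slots)

    def run(self, func, *args):
        # Sub-jobs go to the pilot's own slots, not to a batch queue.
        return self._pool.submit(func, *args)

    def cancel(self):
        # Release the slots after waiting for outstanding sub-jobs.
        self._pool.shutdown(wait=True)


class ToySubJob:
    """Hypothetical stand-in for a sub-job: one task run through a pilot."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self._future = None

    def submit(self, pilot):
        self._future = pilot.run(self._func, *self._args)

    def get_state(self):
        if self._future is None:
            return "New"
        if not self._future.done():
            return "Running"  # covers both waiting for a slot and executing
        return "Failed" if self._future.exception() else "Done"

    def result(self):
        if self._future is None:
            raise RuntimeError("sub-job has not been submitted")
        return self._future.result()


# Usage: one pilot with two slots runs four sub-jobs.
pilot = ToyPilot(slots=2)
jobs = [ToySubJob(pow, 2, n) for n in range(4)]
for job in jobs:
    job.submit(pilot)
results = [job.result() for job in jobs]
states = [job.get_state() for job in jobs]
pilot.cancel()
```

The point of the sketch is only the division of responsibilities: the pilot object owns the resources, while each sub-job object describes and tracks a single task. For the actual class and method names, consult the API documentation and the example scripts linked above.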
Check the [Frequently Asked Questions](https://github.com/saga-project/BigJob/wiki/Frequently-Asked-Questions).
For questions and comments, please join the bigjob-users group.