Home
The SAGA BigJob framework is a SAGA-based pilot job implementation. The Simple API for Grid Applications (SAGA) is a high-level, easy-to-use API for accessing distributed resources. SAGA BigJob supports a wide range of application types and is usable over a broad range of infrastructures, i.e., it is general-purpose, extensible and interoperable. Unlike other common pilot job systems, SAGA BigJob (i) natively supports MPI jobs and (ii) works on a variety of back-end systems, generally reflecting the advantage of using a SAGA-based approach. The following figure gives an overview of the SAGA BigJob architecture.
Pilot-Jobs decouple workload submission from resource assignment. This results in a flexible execution model, which in turn enables the distributed scale-out of applications on multiple and possibly heterogeneous resources. It allows jobs to be executed without having to queue each individual job.
The pilot job provides a container for many sub-jobs, i.e., applications submit these sub-jobs through the pilot job and not through the resource manager. A major advantage of this approach is that the waiting time at the local resource manager, which usually contributes significantly to the overall time-to-completion, is avoided.
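The container idea described above can be illustrated with a small, self-contained toy model. This is only a conceptual sketch and does not use the BigJob or SAGA API: `ToyResourceManager`, `ToyPilot` and `run_workload` are hypothetical names introduced for illustration, and a local thread pool stands in for the cores a real pilot would hold on a cluster. It shows that only the pilot itself passes through the (simulated) queue, while the sub-jobs are handed directly to the pilot.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyResourceManager:
    """Hypothetical stand-in for a batch scheduler; counts queued requests."""

    def __init__(self):
        self.queued_requests = 0

    def submit_pilot(self, cores):
        # The pilot is the only request that goes through the simulated queue.
        self.queued_requests += 1
        return ToyPilot(cores)


class ToyPilot:
    """Hypothetical container that runs sub-jobs on already-acquired cores."""

    def __init__(self, cores):
        # A thread pool stands in for the cores held by a real pilot job.
        self._pool = ThreadPoolExecutor(max_workers=cores)

    def submit_sub_job(self, func, *args):
        # Sub-jobs are handed to the pilot directly, bypassing the queue.
        return self._pool.submit(func, *args)

    def release(self):
        # Wait for outstanding sub-jobs, then give the cores back.
        self._pool.shutdown(wait=True)


def run_workload(task_inputs, cores=4):
    """Run one sub-job per input inside a single pilot."""
    resource_manager = ToyResourceManager()
    pilot = resource_manager.submit_pilot(cores)
    futures = [pilot.submit_sub_job(pow, x, 2) for x in task_inputs]
    results = [future.result() for future in futures]
    pilot.release()
    return results, resource_manager.queued_requests


if __name__ == "__main__":
    results, queued = run_workload(range(8))
    print("sub-job results:", results)
    print("requests that went through the queue:", queued)
```

In this toy model, eight sub-jobs run to completion while the resource manager sees a single request, which is the decoupling of workload submission from resource assignment in miniature.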
An overview of the BigJob architecture can be found [here](https://github.com/saga-project/BigJob/wiki/BigJob-Architecture).
These instructions are targeted towards experienced users and system administrators who need to install their own version of BigJob on an HPC cluster either in user or in system space. Please follow the steps below if this is what you want to do!
- Overview / Concepts
- Install & Configure SAGA
- Install & Configure Redis
- Install & Configure BigJob
- Test the Installation
A guide for running BigJob (using SAGA CSA) on different production infrastructures can be found here.
BigJob provides a command line client (`pilot-cli`) that is shipped with the Python package. Instructions on how to use `pilot-cli` can be found here.
The BigJob Tutorial guides you through all steps necessary for installing BigJob and SAGA/Bliss. The tutorial also gives an overview of BigJob application development.
For an overview of application execution with BigJob see:
[Application Execution](https://github.com/saga-project/BigJob/wiki/Application-Execution-and-Examples)
Below is a "living" list of example scripts that show how BigJob is used with different applications and different types of workloads. Each example focuses on a particular type of application (e.g., BFAST or AMBER), but they are often representative of a larger class of applications (e.g., single-core, multi-core, MPI, coupled, uncoupled, ...).
- [Example 1](https://github.com/saga-project/BigJob/blob/master/examples/example_local_single.py): Running a single BigJob and a single sub-job on localhost.
- Example 2: Running single-core, uncoupled BFAST genome matching workloads with BigJob
- [Example 3](https://github.com/saga-project/BigJob/blob/master/examples/example_local_single_filestaging.py): Using BigJob with file staging. A guide describing the usage of file staging can be found here.
The primary interface exposed by BigJob is the [Pilot-API](./wiki/Pilot-API). The Pilot-API provides a unified way of managing pilots and compute tasks, the so-called compute units.
Check the [Frequently Asked Questions](./wiki/Frequently-Asked-Questions).
For questions and comments, please join the bigjob-users group.