drelu edited this page Sep 1, 2012 · 127 revisions

BigJob Wiki

Introduction

The SAGA BigJob framework is a SAGA-based pilot job implementation. The Simple API for Grid Applications (SAGA) is a high-level, easy-to-use API for accessing distributed resources. SAGA BigJob supports a wide range of application types and is usable over a broad range of infrastructures, i.e., it is general-purpose, extensible and interoperable. Unlike other common pilot job systems, SAGA BigJob (i) natively supports MPI jobs and (ii) works on a variety of back-end systems, which reflects the advantage of a SAGA-based approach. The following figure gives an overview of the SAGA BigJob architecture.

Pilot-Job

Pilot-Jobs decouple workload submission from resource assignment. This results in a flexible execution model, which in turn enables the distributed scale-out of applications on multiple and possibly heterogeneous resources. Jobs can then be executed without queueing each one individually.

Why do you need pilot-jobs?

The pilot job provides a container for many sub-jobs, i.e., applications submit these sub-jobs through the pilot-job and not through the resource manager. A major advantage of this approach is that it avoids the waiting time at the local resource manager, which usually contributes significantly to the overall time-to-completion.
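
The effect on time-to-completion can be illustrated with a small toy model. This is only a sketch and does not use BigJob: the queue wait, the runtimes, and the function names `direct_makespan` and `pilot_makespan` are hypothetical and chosen for illustration. It assumes that every submission to the resource manager pays the same constant queue wait, and that directly submitted jobs are submitted one after another.

```python
import heapq

QUEUE_WAIT = 600.0  # hypothetical wait at the resource manager, in seconds


def direct_makespan(runtimes, queue_wait=QUEUE_WAIT):
    """Jobs are submitted one after another; each one waits in the queue first."""
    return sum(queue_wait + r for r in runtimes)


def pilot_makespan(runtimes, cores, queue_wait=QUEUE_WAIT):
    """One pilot is queued once; sub-jobs then run inside it on `cores` slots."""
    # Min-heap of the times (relative to pilot start) at which each core is free.
    free_at = [0.0] * cores
    heapq.heapify(free_at)
    for r in runtimes:
        start = heapq.heappop(free_at)       # earliest available core
        heapq.heappush(free_at, start + r)   # core is busy until start + r
    return queue_wait + max(free_at)


runtimes = [60.0] * 8                        # eight sub-jobs of one minute each
direct = direct_makespan(runtimes)           # 8 * (600 + 60) seconds
pilot_1 = pilot_makespan(runtimes, cores=1)  # 600 + 8 * 60 seconds
pilot_4 = pilot_makespan(runtimes, cores=4)  # 600 + 2 * 60 seconds
print(direct, pilot_1, pilot_4)
```

Under these assumptions the queue wait is paid once per pilot rather than once per sub-job. Real queue waits vary from job to job, so the numbers only show the trend.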

BigJob Architecture

An overview of the BigJob architecture can be found [here](https://github.com/saga-project/BigJob/wiki/BigJob-Architecture).

How to Install Your Own Version of BigJob

These instructions are targeted towards experienced users and system administrators who need to install their own version of BigJob on an HPC cluster either in user or in system space. Please follow the steps below if this is what you want to do!

  1. Overview / Concepts
  2. Install & Configure SAGA
  3. Install & Configure Redis
  4. Install & Configure BigJob
  5. Test the Installation

How to Run BigJob on Production Infrastructures

A guide for running BigJob (using SAGA CSA) on different production infrastructures can be found here.

BigJob Command Line Client

BigJob provides a command line client (pilot-cli) that is shipped with the Python package. Instructions on how to use pilot-cli can be found here.

Getting Started - BigJob Tutorial

The BigJob Tutorial guides you through all steps necessary for installing BigJob and SAGA/Bliss. The tutorial also gives an overview of BigJob application development.

BigJob Application Execution and Examples

For an overview of application execution with BigJob see:

[Application Execution](https://github.com/saga-project/BigJob/wiki/Application-Execution-and-Examples)

Below is a "living" list of example scripts that show how BigJob is used with different applications and different types of workloads. Each example focuses on a particular type of application (e.g., BFAST or AMBER), but is often representative of a larger class of applications (e.g., single-core, multi-core, MPI, coupled, uncoupled, ...).

API

The primary interface exposed by BigJob is the [Pilot-API](./wiki/Pilot-API). The Pilot-API provides a unified way of managing pilots and compute tasks, the so-called compute units.
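
A minimal sketch of how a pilot and a compute unit might be described and submitted is given below. It is written from memory of the BigJob examples and is not an authoritative reference: the class and method names (`PilotComputeService`, `ComputeDataService`, `create_pilot`, `submit_compute_unit`), the description keys, the Redis coordination URL and the `fork://localhost` service URL are all assumptions that should be checked against the Pilot-API documentation. The helper names `make_pilot_description`, `make_compute_unit_description` and `run_example` are hypothetical.

```python
import os

# Assumption: a Redis server used for coordination is reachable on this host.
COORDINATION_URL = "redis://localhost:6379"


def make_pilot_description(cores=1):
    """Describe the pilot: where it runs and how many processes it holds."""
    return {
        "service_url": "fork://localhost",   # assumed URL for local execution
        "number_of_processes": cores,
        "working_directory": os.getcwd(),
    }


def make_compute_unit_description():
    """Describe one compute unit (sub-job) that runs inside the pilot."""
    return {
        "executable": "/bin/date",
        "arguments": [],
        "number_of_processes": 1,
        "output": "stdout.txt",
        "error": "stderr.txt",
    }


def run_example():
    """Start a pilot, run one compute unit in it, then shut everything down."""
    # Imported lazily so this file still loads where BigJob is not installed.
    from pilot import PilotComputeService, ComputeDataService

    pilot_compute_service = PilotComputeService(COORDINATION_URL)
    pilot_compute_service.create_pilot(
        pilot_compute_description=make_pilot_description())

    compute_data_service = ComputeDataService()
    compute_data_service.add_pilot_compute_service(pilot_compute_service)
    compute_data_service.submit_compute_unit(make_compute_unit_description())

    compute_data_service.wait()      # block until the compute unit has finished
    compute_data_service.cancel()
    pilot_compute_service.cancel()
```

Calling `run_example()` requires an installed BigJob and a running coordination service. The two description helpers are plain dictionaries and can be inspected without either.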

API Documentation

Where to Get Help?

Check the [Frequently Asked Questions](./wiki/Frequently-Asked-Questions).

For questions and comments, please join the bigjob-users group on Google Groups.