SLURM DAG Workflow Submitter

Directed Acyclic Graph (DAG) workflow submitter for the SLURM queueing system

This is a workflow manager for SLURM that submits DAG-based workflows. It is similar to HTCondor DAGMan, but a much simpler implementation.

To use sdag to submit your workflow, you need to do the following:

  • Create a SLURM script for each workflow job.
  • Create a DAG description file for your workflow.
  • Submit your workflow with: sdag workflow-description-file

Structure

The workflow description file includes two types of statements:

  • Job description: JOB job-name job-script-file-path. This must be provided for each job in the workflow. The keyword JOB is case sensitive, and the fields are separated by spaces or tabs.
  • Workflow: PARENT parent-jobs-list CHILD child-jobs-list. On submission, each child job is submitted with a dependency on all parent jobs. A child job won't start until all parent jobs have completed successfully, i.e. ended with exit code 0. If any parent job fails, all child jobs are cancelled, so there are no orphan jobs. See Job dependencies - SLURM. Each parent or child job list must be space or tab separated.
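
The two statement types can be illustrated with a small diamond-shaped workflow. The Python sketch below is not sdag's own code. It parses a description held in a string and shows how a child's dependency on its parents could be expressed with sbatch's `--dependency=afterok:` option. The job names, script names, job IDs, and the helpers `parse_dag` and `dependency_flag` are all hypothetical.

```python
# Hypothetical description file for a diamond workflow: A -> (B, C) -> D.
SAMPLE = """\
JOB A a.sbatch
JOB B b.sbatch
JOB C c.sbatch
JOB D d.sbatch
PARENT A CHILD B C
PARENT B C CHILD D
"""


def parse_dag(text):
    """Return (jobs, parents): job name -> script, child -> list of parents."""
    jobs, parents = {}, {}
    for line in text.splitlines():
        fields = line.split()  # splits on both spaces and tabs
        if not fields:
            continue
        if fields[0] == "JOB" and len(fields) == 3:
            jobs[fields[1]] = fields[2]
            parents.setdefault(fields[1], [])
        elif fields[0] == "PARENT" and "CHILD" in fields:
            split_at = fields.index("CHILD")
            for child in fields[split_at + 1:]:
                for parent in fields[1:split_at]:
                    if parent not in parents.setdefault(child, []):
                        parents[child].append(parent)
    return jobs, parents


def dependency_flag(parent_ids):
    """Build an sbatch option that waits for every parent to exit with code 0."""
    if not parent_ids:
        return ""
    return "--dependency=afterok:" + ":".join(str(i) for i in parent_ids)


jobs, parents = parse_dag(SAMPLE)
print(parents["D"])
print(dependency_flag([101, 102]))  # 101 and 102 are made-up SLURM job IDs
```

Here job D depends on both B and C, so it would only be released once both have finished successfully.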

Guidelines

You need to follow these guidelines when using sdag:

  • You must provide a valid workflow description file. If not, sdag will return: Error: You must enter a valid DAG description file
  • You must provide a valid job script file path in each job description statement. If not, sdag will return: Error in line [XX]: XX.sbatch is not a file.
  • If a job description statement has the wrong syntax, sdag will return:
Error in line [XX]: A job definition statement must be written as:
JOB <job_name> <job_submission_file>
  • If a workflow statement has the wrong syntax, sdag will return:
Error in line [XX]: A workflow Statement must be written as:
PARENT <parent_jobs> CHILD <children_jobs>
  • In a workflow statement, if one of the child jobs is already defined in a previous workflow statement as a parent for one of the parent jobs, or one of their ancestors, sdag will return:
Error in line [XX]: Job YY Cannot be a parent for job ZZ. Job ZZ is an ancestor of job YY
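
The last guideline amounts to rejecting any workflow statement that would close a cycle in the graph. A minimal Python sketch of such a check follows, assuming parent links are kept in a dictionary that maps each job to its direct parents. The helpers `ancestors` and `add_edge` are hypothetical and are not taken from sdag.

```python
def ancestors(job, parents):
    """Return the set of all direct and indirect parents of `job`."""
    seen = set()
    stack = list(parents.get(job, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def add_edge(parent, child, parents):
    """Record parent -> child, refusing edges that would create a cycle."""
    # A self-edge or an edge back to an ancestor would make the graph cyclic.
    if child == parent or child in ancestors(parent, parents):
        raise ValueError(
            "Job %s cannot be a parent for job %s: job %s is an ancestor of job %s"
            % (parent, child, child, parent)
        )
    parents.setdefault(child, []).append(parent)


parents = {}
add_edge("A", "B", parents)
add_edge("B", "C", parents)
try:
    add_edge("C", "A", parents)  # A is already an ancestor of C
except ValueError as err:
    print(err)
```

Checking ancestors before recording each edge keeps the graph acyclic at every step, so no separate pass over the whole workflow is needed afterwards.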

Prerequisites

  • Python 2.7+
  • sbatch
  • Add sdag to PATH
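
One way to confirm the last two prerequisites is to look both commands up on PATH. The sketch below uses `shutil.which`, which needs Python 3.3 or later; the helper `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(names):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(["sbatch", "sdag"])
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("sbatch and sdag are both on PATH")
```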

Support and Bug Reports

Report an issue in the issues section or send an email to [email protected]