{% hint style="info" %} If you'd prefer to see the following instructions in a video, we have a tutorial on batch job submission on Oscar. {% endhint %}
To run a batch job on Oscar, you first have to write a script that describes what resources you need and how your program will run. Some example batch scripts are available in your home directory on Oscar, in the directory:

```
~/batch_scripts
```
A batch script starts by specifying the `bash` shell as its interpreter with the line:

```bash
#!/bin/bash
```
By default, a batch job will reserve 1 core and 2.8GB of memory per core for your job. You can customize the amount of resources allocated for your job by explicitly requesting them in your batch script with a series of lines starting with `#SBATCH`, e.g.,

```bash
#SBATCH -n 4
#SBATCH --mem=16G
#SBATCH -t 1:00:00
```
The above lines request 4 cores (`-n`), 16GB of memory per node (`--mem`), and one hour of runtime (`-t`). After you have described the resources you want allocated for the job, you then give the commands that you want to be executed.
{% hint style="warning" %}
All of the `#SBATCH` instructions in your batch script must appear before the commands you want to run.
{% endhint %}
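Putting these pieces together, here is a minimal sketch of a complete batch script. The `module load python/3.9.0` line and `my_script.py` file are placeholders for whatever software and program your job actually runs, not modules or files guaranteed to exist on Oscar:

```bash
#!/bin/bash

# Resource requests (must come before any commands)
#SBATCH -J my-analysis      # job name
#SBATCH -n 4                # 4 tasks/cores
#SBATCH --mem=16G           # 16GB of memory per node
#SBATCH -t 1:00:00          # 1 hour of runtime

# Commands to run (module name and script below are placeholders)
module load python/3.9.0
python my_script.py
```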
Once you have your batch script, you can submit a batch job to the queue using the `sbatch` command:

```bash
sbatch <jobscript>
```
As an alternative to requesting resources within your batch script, it is possible to define the resources requested as command-line options to `sbatch`. For example, the command below requests 4 cores (`-n`), 16GB of memory per node (`--mem`), and one hour of runtime (`-t`) to run the job defined in the batch script.

```bash
sbatch -n 4 -t 1:00:00 --mem=16G <jobscript>
```
Note that command-line options passed to `sbatch` will override the resources specified in the script, so this is a handy way to reuse an existing batch script when you just want to change a few of the resource values.
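For example, assuming you have a script named `run.sh` (a hypothetical name) that contains `#SBATCH --mem=16G`, you can reuse it with a larger memory request without editing the file:

```bash
# Submit with the resources written in the script
sbatch run.sh

# Reuse the same script, overriding only the memory request
sbatch --mem=32G run.sh
```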
The `sbatch` command will return a number, which is your Job ID. You can view the output of your job in the file `slurm-<jobid>.out` in the directory where you invoked the `sbatch` command. For instance, you can view the last 10 lines of output with:

```bash
tail -10 slurm-<jobid>.out
```
Alternatively, you can specify the files where standard output and standard error should be written using the `-o` and `-e` flags. You can use `%j` within the output/error filenames to insert the job ID. For example, to have your output written to a file named `my-output-<jobid>.out`, add the following line to your batch script:

```bash
#SBATCH -o my-output-%j.out
```
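As a sketch, the following pair of lines sends standard output and standard error to separate job-specific files (the filename prefixes are arbitrary):

```bash
#SBATCH -o my-output-%j.out   # standard output -> my-output-<jobid>.out
#SBATCH -e my-errors-%j.err   # standard error  -> my-errors-<jobid>.err
```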
A full description of all of the options for `sbatch` can be found online or by using the following command on Oscar:

```bash
$ man sbatch
```
The table below summarizes some of the more useful options for `sbatch`.
| option | purpose |
| --- | --- |
| `-J` | Specify the job name that will be displayed when listing the job |
| `-n` | Number of tasks (= number of cores, if the `--cpus-per-task` or `-c` option is not given) |
| `-c` | Number of CPUs or cores per task (on the same node) |
| `-N` | Number of nodes |
| `-t` | Runtime, as HH:MM:SS |
| `--mem=` | Requested memory per node |
| `-p` | Request a specific partition |
| `-o` | Filename for standard output from the job |
| `-e` | Filename for standard error from the job |
| `-C` | Add a feature constraint (a tag that describes a type of node); you can also select multiple feature constraints |
| `--mail-type=` | Specify the events that you should be notified of by email: BEGIN, END, FAIL, REQUEUE, and ALL |
| `--mail-user=` | Email address where notifications should be sent |
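As an illustration of how several of these options combine, here is a sketch of a batch script header; the job name, filenames, and email address below are placeholders:

```bash
#!/bin/bash
#SBATCH -J example-job                # job name shown when listing jobs
#SBATCH -N 1                          # one node
#SBATCH -n 1                          # one task
#SBATCH -c 4                          # four cores for that task
#SBATCH -t 2:00:00                    # two hours of runtime
#SBATCH --mem=8G                      # 8GB of memory per node
#SBATCH -o example-%j.out             # standard output file
#SBATCH -e example-%j.err             # standard error file
#SBATCH --mail-type=END,FAIL          # email when the job ends or fails
#SBATCH --mail-user=user@example.com  # placeholder email address

echo "Running on $(hostname)"
```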
When a user logs into Oscar, there are pre-set environment variables such as HOME; these make up the user's login environment. A user may modify an existing environment variable or add a new one, so by the time a user submits a Slurm batch job, the current environment may differ from the login environment. By default, it is the user's current environment variables, not the login environment variables, that are accessible to the user's batch jobs on Oscar.
To modify or add an environment variable, either:

- run the following command in your shell:

  ```bash
  export my_variable=my_value
  ```

- or add the following line to your batch script:

  ```bash
  #SBATCH --export=my_variable=my_value
  ```
After either step above, your batch job can access the environment variable `my_variable`, whose value is `my_value`.
To export more than one environment variable, list all the name=value pairs separated by commas:

```bash
#SBATCH --export=my_variable1=my_value1,my_variable2=my_value2,my_variable3=my_value3
```
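A minimal sketch of a job script that reads these exported variables (the variable names match the example above):

```bash
#!/bin/bash
#SBATCH -n 1

# Exported variables are available as ordinary environment variables
echo "my_variable1 = $my_variable1"
echo "my_variable2 = $my_variable2"
echo "my_variable3 = $my_variable3"
```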
Here is an example in which a shell script loops over an input file and submits a batch job for each directory listed in the file, passing each directory to a job for processing.
The input file `test.txt` has multiple lines, where each line is a directory:

```
/users/yliu385/data/yliu385/Test/
/users/yliu385/data/yliu385/Test/pip
/users/yliu385/data/yliu385
```
The `loop.sh` script reads each line (a directory) from the input file and passes the directory as an environment variable to a batch job:

```bash
#!/bin/bash

# Require exactly one argument: a file listing one directory per line
if [ "$#" -ne 1 ] || ! [ -f "$1" ]; then
    echo "Usage: $0 FILE"
    exit 1
fi

# Submit one batch job per directory, passing the directory via --export
while IFS= read -r line; do
    sbatch --export=directory="$line" test.job
done < "$1"
```
`test.job` is a job script that runs `test.sh` to process the directory passed in as an environment variable:

```bash
#!/bin/sh
#SBATCH -N 1
#SBATCH -n 1

./test.sh "$directory"
```
`test.sh` is a bash script that simply echoes the directory it receives as its first argument:

```bash
#!/bin/bash
echo "$0 argument: $1"
```
If you run `./loop.sh test.txt`, three jobs are submitted. Each job generates output like the following:

```
/users/yliu385/data/yliu385/Test/
./test.sh argument: /users/yliu385/data/yliu385/Test/
```
Variables can also be passed on the `sbatch` command line to set the job name and the output and error file names, as shown in the following example:

```bash
t=`date +"%Y-%m-%d"`
sbatch --job-name=test.$t --output=test.out.$t --error=test.err.$t test.job
```