Job Submission

There are two basic ways to submit jobs to SLURM:

  1. Interactive - You request an allocation of resources, then log in to the node and use it interactively.
  2. Batch - You submit your job with a job script that defines all of the steps you want to accomplish. This method is best for large quantities of jobs that you have already tested.

Interactive Job Submission

salloc runs an interactive job on the cluster. You can request a shell to run commands or submit a script or app interactively.

Example usage:

[smy190@ip-10-37-171-122 ~]$ salloc --nodes=1 --ntasks=1 --cpus-per-task=8 --time=1:00:00 --partition=g6-xl --gres=gpu:1 --mem=16GB --job-name=interactive-test
salloc: Granted job allocation 464
salloc: Waiting for resource configuration

Note: On the AWS Cluster, nodes have to be provisioned behind the scenes before the job can be dispatched. This can take ~5 min.

The command-line arguments describe the job configuration. See the full salloc documentation on the Slurm website for a detailed list.
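
Once the allocation is granted, a common pattern is to start a shell on the allocated node with srun (whether salloc itself drops you onto the node depends on the cluster's configuration):

srun --pty bash    # start an interactive shell on the allocated node
hostname           # commands now run on the compute node
exit               # leave the node; exit the salloc session to release the allocation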

Batch Job Submission

sbatch submits a script to be run as a batch job on the cluster. This allocates a job, similar to salloc, but you do not get a shell into the job and cannot (easily) control the job once started.

Before running sbatch, you must create a job script, e.g. with nano my_slurm.job. Below is an example job script.

Job Script Example

Note: There are many useful SLURM environment variables. Consider using these in your job script; an example follows the explanation below.

#!/bin/bash
#SBATCH --job-name=my_job          # Job name
#SBATCH --output=%x_%j.o           # Output file (%x expands to Job name, %j expands to job ID)
#SBATCH --error=%x_%j.e            # Error file
#SBATCH --ntasks=1                 # Number of tasks (processes)
#SBATCH --cpus-per-task=4          # Number of CPU cores per task
#SBATCH --mem=4G                   # Total memory per node
#SBATCH --time=00:30:00            # Time limit (hh:mm:ss)
#SBATCH --partition=urcdtest-med   # Partition name

# Load necessary modules
module load openmpi

# Run your application
python my_script.py

The #SBATCH directives describe parameters used to allocate and configure the job. E.g. #SBATCH --ntasks=1 tells the scheduler to allocate one task for the job. These are mostly identical to the command-line options for salloc. --output and --error name the files where STDOUT and STDERR will be written, respectively. In these file names, %j is replaced by the job ID and %x by the job name.
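
For example, the following lines could be added to the script above to make use of a few of the variables SLURM sets for every job (the echo lines are purely illustrative):

# A few of the variables SLURM sets inside every job
echo "Job ${SLURM_JOB_ID} (${SLURM_JOB_NAME}) running on $(hostname)"
echo "CPUs per task: ${SLURM_CPUS_PER_TASK}"
cd "${SLURM_SUBMIT_DIR}"    # the directory sbatch was invoked from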

Submit the job

sbatch my_slurm.job
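
sbatch prints the ID of the newly submitted job, which you can then use to monitor it with squeue (the job ID below is illustrative):

$ sbatch my_slurm.job
Submitted batch job 465
$ squeue -u $USER    # list your queued and running jobs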

Job Dependencies

Submit a job that starts only after another completes:

sbatch --dependency=afterok:JOBID my_dependent_job.slurm

This line schedules my_dependent_job.slurm to start only if JOBID finishes successfully.
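
In practice you usually capture the first job's ID instead of typing it by hand. A minimal sketch, assuming a first script named preprocess.slurm (a hypothetical name); sbatch --parsable prints only the job ID:

jobid=$(sbatch --parsable preprocess.slurm)
sbatch --dependency=afterok:${jobid} my_dependent_job.slurm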

Advanced Tips

  • Resource Optimization: Adjust --cpus-per-task and --mem according to your job's requirements for optimal resource use.
  • Array Jobs: Easily submit multiple similar jobs using job arrays with sbatch --array=0-9 my_slurm.job; a sketch follows this list.
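
Inside an array job, each task receives its index in the SLURM_ARRAY_TASK_ID environment variable, typically used to select a different input per task. A minimal sketch (the input_N.txt naming scheme is hypothetical):

#!/bin/bash
#SBATCH --job-name=my_array_job
#SBATCH --output=%x_%A_%a.o    # %A = array job ID, %a = array task index
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# Each task processes a different input file, selected by its array index
python my_script.py input_${SLURM_ARRAY_TASK_ID}.txt

Submit it with sbatch --array=0-9 my_array.job to launch ten tasks, one per index.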