
Job Submission

Job Submission Structure

A job file, after invoking a shell (e.g., #!/bin/bash), consists of two bodies of commands. The first is the directives to the scheduler, indicated by lines starting with #SBATCH. These are interpreted by the shell as comments, but the Slurm scheduler understands them as directives. These directives include the resource requests of the job, such as the number of nodes, the number of cores, the maximum amount of time the job will run for, email notifications, and so forth. Following this, and only following this, come the shell commands. These should include any modules that the job needs to load, and the commands that the job will run, which can include calls to other scripts.
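
As a minimal sketch (the module name, program and resource values here are placeholders; substitute whatever your job actually requires), a job script might look like this:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=0-01:00:00

# Shell commands follow the scheduler directives
module load MyApplication/1.0.0    # placeholder module name and version
my_application input.dat           # placeholder command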

The following should be considered when writing a job submission script:

  • All scripts must invoke a shell. In most cases this will be the bash shell; #!/bin/bash
  • The submission system has default values: a default walltime of 10 minutes (this will almost certainly need to be changed), a default partition (cascade), a default number of tasks (1), a default number of CPUs per task (1), and a default number of nodes (1).
  • If a shared-memory multithreaded job is being submitted, or subprocesses are launched to make use of additional cores, the job should be set to have the number of tasks = 1 and a number of CPUs per task equal to the number of threads desired, up to a maximum of the number of cores on the node the job is running on. For example: #SBATCH --ntasks=1 followed by #SBATCH --cpus-per-task=8 on a new line.
  • If a distributed-memory message-passing (e.g., MPI) job is being submitted, the job can request more than a single compute node and multiple tasks (e.g., #SBATCH --ntasks=8). If the job has more tasks than the cores available on a single node, the scheduler will make an effort to keep the allocated cores contiguous. To force a specific number of tasks per node, use the --ntasks-per-node= option. Sketches of both cases follow this list.
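
As sketches only (the program name is a placeholder), the resource requests for these two cases might look like this:

# Shared-memory (multithreaded) job: one task with eight cores on a single node
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8

# Distributed-memory (MPI) job: eight tasks spread over two nodes, four per node
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=4
srun ./my_mpi_program    # or the launcher provided by your MPI module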

Certain applications have their own built-in parallelisation approaches, which nevertheless require Slurm directives. MATLAB, for example, uses a "parallel pool" (parpool) approach. See our MATLAB page for more information.

Job Script Examples and Generator

A compilation of example job scripts for various software packages exists on Spartan at /apps/examples/. A copy of this repository is kept at https://gitlab.unimelb.edu.au/hpc-public/spartan-examples. These examples include both scheduler directives and application tests. Additional examples may be added by contacting our helpdesk.

A simple web-based job script generator is also available to help compose jobs.

Job memory allocation

By default the scheduler will allocate memory equal to 4000MB multiplied by the number of cores requested. In some cases this might not be enough (e.g., a very large dataset that needs to be loaded with a low level of parallelisation).

Additional memory can be allocated with the --mem=[mem][M|G|T] directive (memory per node allocated to the job) or --mem-per-cpu=[mem][M|G|T] (memory per core allocated to the job).
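
For example (the values are illustrative only; note that --mem and --mem-per-cpu are mutually exclusive, so use one or the other):

# Request 32 GB in total per node for the job
#SBATCH --mem=32G

# Or instead request 8 GB for each core allocated to the job
#SBATCH --mem-per-cpu=8G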

Interactive Jobs

  • As an alternative to submitting a batch job, you can perform interactive work using the sinteractive command. This is handy for testing and debugging. It will allocate a compute node and log you in to it.

Example

Spartan has an interactive partition, which provides instant access to up to 8 CPU cores and 96GB of RAM for up to 2 days. To get an interactive job for 1 hour with 1 CPU, run:

sinteractive -p interactive --time=01:00:00 --cpus-per-task=1
Note that there are limits on the number of jobs you can run, and on the maximum memory and CPU you can request, on the interactive partition. These limits are listed in the CPU/Memory/GPU quotas section below.

The interactive partition has CPU, RAM and time limits. If you need more resources for your interactive job, you can request an interactive job on the cascade partition. To run an interactive job with 8 CPU cores and 128GB of RAM for 7 days, run:

sinteractive -p cascade --time=7-0:0:0 --cpus-per-task=8 --mem=128G

  • See examples, including with X11 forwarding, at /apps/examples/common/Interact. An X11 client is required for local visualisation (e.g., Xming or MobaXterm for MS-Windows, XQuartz for macOS).
  • There is also the OnDemand service to allow you to use a graphical session on Spartan.
  • Jupyter Notebooks and RStudio can also be run on Spartan through the OpenOnDemand service, which starts these applications and allows you to access them in a web browser.

Job Arrays

Job arrays are great for kicking off a large number of independent jobs at once with the same job script; for instance, batch processing a series of files where each file can be processed independently of the others. Consider an example of an array of files, data_1.dat to data_50.dat, to be processed with myProgram:

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=0-00:15:00
#SBATCH --array=1-50
myProgram data_${SLURM_ARRAY_TASK_ID}.dat

There are two components in use here: first the scheduler directive for the array tasks (#SBATCH --array=1-50), then the environment variable that identifies each task and is used to select its data file (data_${SLURM_ARRAY_TASK_ID}.dat).

This will create 50 jobs, each calling myProgram with a different data file. These jobs will run in any order, as soon as resources are available (potentially, all at the same time!).

Directives may be set as a range (e.g., #SBATCH --array=0-31), as comma separated values (#SBATCH --array=1,2,5,19,27), or with a step value (e.g., #SBATCH --array=1-7:2).
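
If the input files are not numbered sequentially, one common pattern (sketched here with a hypothetical filelist.txt containing one filename per line) is to use the array index to pick a line from that list:

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=0-00:15:00
#SBATCH --array=1-50
# Select the line of filelist.txt corresponding to this task's array index
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" filelist.txt)
myProgram "${INPUT}"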

See the examples on Spartan at: /apps/examples/Array and /apps/examples/Octave.

Job Dependencies

It is not unusual for a user to make the launch of one job dependent on the status of another job. The most common example is when a user wishes to make the output of one job the input of a second job. Both jobs can be launched simultaneously, but the second job can be prevented from running until the first job has completed successfully. In other words, there is a conditional dependency on the job.

Several conditional directives can be placed on a job, which are tested prior to the job being initiated; these are summarised as after, afterany, afterok, afternotok, and singleton. These can be specified at submission time (e.g., sbatch --dependency=afterok:$jobid1 job2.slurm). Multiple jobs can be listed as dependencies with colon-separated values (e.g., sbatch --dependency=afterok:$jobid1:$jobid2 job3.slurm).
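
As a minimal sketch of chaining jobs from the command line or a wrapper script (job1.slurm and job2.slurm are placeholder script names), sbatch's --parsable option can be used to capture the first job's ID:

#!/bin/bash
# Submit the first job and capture its job ID
jobid1=$(sbatch --parsable job1.slurm)
# Submit the second job; it will only start once the first exits with code zero
sbatch --dependency=afterok:${jobid1} job2.slurm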

Some dependency types

Directive Description
after:jobid[:jobid...] job can begin after the specified jobs have started
afterany:jobid[:jobid...] job can begin after the specified jobs have terminated
afternotok:jobid[:jobid...] job can begin after the specified jobs have failed
afterok:jobid[:jobid...] job can begin after the specified jobs have run to completion with an exit code of zero (see the user guide for caveats).
singleton job can begin execution after all previously launched jobs with the same name and user have ended.

See examples on Spartan at: /apps/examples/depend.

Job Output and Errors

By default Slurm combines all job output into a single file named after the job ID. This file appears when the job starts running, with a name like slurm-17439270.out, for example. The information in this file will include output from scripts run in the job as well as error information. Sometimes it is desirable to separate the output and error information, in which case directives like the following can be used in the jobscript:

#SBATCH -o "slurm-%N.%j.out" # STDOUT
#SBATCH -e "slurm-%N.%j.err" # STDERR

This will create two files, one for output and one for error, whose names include the compute node the job ran on (%N) and the job ID (%j).

Sometimes, the word "Killed" can be seen in the job error or output log, for example:

/var/spool/slurm/job8953391/slurm_script: line 7: 12235 Killed                  python /home/scrosby/examples/mem_use/mem_use.py

This is normally due to your job exceeding the memory you requested. By default, jobs on Spartan are allocated a certain amount of RAM per CPU requested, equal to the memory of the node divided by the number of cores on the node. The memory allocated to a job can be increased with the #SBATCH --mem=[mem][M|G|T] directive (this is per node that your job runs on, in megabytes, gigabytes, or terabytes) or #SBATCH --mem-per-cpu=[mem][M|G|T] for memory per core.

Sometimes job output does not appear in the output file immediately due to buffering. Buffering is where output is placed in a queue, awaiting the buffer to be flushed. This normally happens automatically, but if the output isn't large, or if the processor is busy doing other things, it can take some time for the buffer to be flushed. To disable buffering and get the output immediately, run the command that produces the output with stdbuf; e.g., instead of python myscript.py in a jobscript, use stdbuf -o0 -e0 python myscript.py.
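
For example, in a jobscript (myscript.py is a placeholder for your own program):

# Flush stdout and stderr immediately rather than letting them buffer
stdbuf -o0 -e0 python myscript.py
# For Python programs specifically, the interpreter's -u flag achieves a similar effect
python -u myscript.py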

Monitoring job memory, CPU and GPU utilisation

As a result of feedback obtained from the 2020 Spartan HPC user survey, a job monitoring system was developed. This allows users to monitor the memory, CPU and GPU usage of their jobs via a simple command line script. For more details, please see Job Monitoring.

CPU/Memory/GPU quotas

On public partitions of Spartan (cascade, interactive, long) CPU, memory and GPU quotas have been implemented. This ensures no one user or project can use all the resources in these partitions. The limits are currently set at 17% of the resources in each partition.
 

Note

If a job is not running due to "QOSMaxCpuPerUserLimit", your running jobs already use the full per-user CPU quota for that partition. If a job is not running due to "QOSMaxMemPerUserLimit", your running jobs already use the full per-user memory quota for that partition.

Partition | Running jobs | CPU Quota (CPU cores) per user | Memory Quota (MB RAM) per user | GPUs per user | CPU Quota (CPU cores) per project | Memory Quota (MB RAM) per project | GPUs per project
cascade,sapphire | No limit | 1400 | 14486111 | - | 1400 | 14486111 | -
interactive | 1 | 8 | 73728 | - | - | - | -
long | No limit | 36 | 372500 | - | 36 | 372500 | -
bigmem | No limit | 72 | 3010000 | - | 72 | 3010000 | -
gpu-a100-short | 1 | 8 | 123750 | 1 | - | - | -
gpu-a100 | No limit | 384 | 5940000 | 48 | 384 | 5940000 | 48

GPU Partitions

Spartan hosts a GPGPU service based on NVIDIA A100 (Ampere architecture) GPUs. More information can be found on our GPU page.

Scheduler Commands and Directives

Slurm User Commands

User Command Slurm Command
Job submission sbatch [script_file]
Job delete scancel [job_id]
Job status squeue -j [job_id]
Job status squeue --me
Node list sinfo -N
Queue list squeue
Cluster status sinfo

Slurm Job Commands

Job Specification Slurm Command
Script directive #SBATCH
Partition -p [partition]
Job Name --job-name=[name]
Nodes -N [min[-max]]
Task (MPI rank) Count -n [count]
Wall Clock Limit -t [days-hh:mm:ss]
Event Address --mail-user=[address]
Event Notification --mail-type=[events]
Memory (per node) Size --mem=[mem][M|G|T]
Memory (per CPU) Size --mem-per-cpu=[mem][M|G|T]

Slurm Environment Variables

Job Information Environment Variable
Job ID $SLURM_JOBID
Submit Directory $SLURM_SUBMIT_DIR
Submit Host $SLURM_SUBMIT_HOST
Node List $SLURM_JOB_NODELIST
Job Array Index $SLURM_ARRAY_TASK_ID
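
As an illustrative sketch, these variables can be used directly inside a job script, for example:

#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=0-00:10:00
# Run from the directory the job was submitted from
cd "${SLURM_SUBMIT_DIR}"
echo "Job ${SLURM_JOBID} is running on: ${SLURM_JOB_NODELIST}"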