HPC jobs
Best practices for running a job on a cluster
Job scheduler
Note: there are many job scheduler programs available, but a very common one is SLURM. Useful commands:
```bash
# Submit the job
sbatch
# Check that the job is in the queue
squeue
```
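A few more commands are often useful in day-to-day work; a minimal sketch is below, where the job ID 12345 is a placeholder:

```bash
# Show only your own jobs ($USER is your cluster username)
squeue -u "$USER"
# Cancel a job by its job ID
scancel 12345
# Show the full details of a pending or running job
scontrol show job 12345
```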
Toy example of a bash script to submit:

```bash
#!/bin/bash
#SBATCH -D /home/USERNAME/   # working directory
#SBATCH -c 1                 # number of CPUs. Default: 1
#SBATCH -t 00:10:00          # time limit for the job, HH:MM:SS
#SBATCH --mem=1G             # RAM memory
# my commands: software, pipeline, etc.
snakemake -j1
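```

Note that snakemake -j1 restricts the workflow to one core, matching the single CPU requested with -c 1. Saved as, say, toy_job.sh (a placeholder name), the script would be submitted with:

```bash
sbatch toy_job.sh
```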
Job parallelisation
Job parallelisation is crucial for achieving high performance and running jobs effectively on an HPC system. Here are two key scenarios where it is particularly important:
- Independent computational tasks: When tasks are independent of each other, parallelisation can enhance efficiency by allowing them to run concurrently.
- Multi-threaded tools: Some tools are specifically designed to perform parallel computations through multi-threading, enabling them to utilise multiple CPU cores for increased performance; see the sketch after this list for matching a tool's thread count to the requested CPUs.
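For the multi-threaded case, a minimal sketch of a job script is shown below. The tool name my_tool and its --threads flag are hypothetical placeholders for any multi-threaded program; SLURM_CPUS_PER_TASK is the environment variable SLURM sets from -c, which keeps the thread count in sync with the requested CPUs:

```bash
#!/bin/bash
#SBATCH -c 4                 # request 4 CPUs for multi-threading
#SBATCH -t 00:30:00          # time limit, HH:MM:SS
#SBATCH --mem=4G             # RAM memory

# my_tool and its --threads flag are placeholders;
# SLURM_CPUS_PER_TASK equals the value requested with -c
my_tool --threads "$SLURM_CPUS_PER_TASK" input.data
```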
Job parallelisation using SLURM
Job arrays: -a
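As a minimal sketch of a job array (file names below are placeholders), -a 1-10 launches ten copies of the script, each with a different value of the SLURM_ARRAY_TASK_ID environment variable, which can be used to pick a different input per task:

```bash
#!/bin/bash
#SBATCH -D /home/USERNAME/   # working directory
#SBATCH -c 1                 # number of CPUs per array task
#SBATCH -t 00:10:00          # time limit per array task, HH:MM:SS
#SBATCH --mem=1G             # RAM memory per array task
#SBATCH -a 1-10              # run tasks 1..10 as a job array

# Each task processes its own input; sample_1.txt ... sample_10.txt
# and process_sample.sh are placeholder names
./process_sample.sh "sample_${SLURM_ARRAY_TASK_ID}.txt"
```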
Efficient resource usage
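A common way to use resources efficiently is to request only what a job actually needs. One approach, assuming SLURM accounting is enabled on the cluster, is to check what a finished job actually used and adjust future requests accordingly (12345 is a placeholder job ID):

```bash
# Summary of CPU and memory efficiency for a finished job
# (seff ships with SLURM's contribs and may not be installed everywhere)
seff 12345
# More detailed accounting: peak memory (MaxRSS) and elapsed time
sacct -j 12345 --format=JobID,JobName,Elapsed,MaxRSS,State
```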
Copyright: CC-BY-SA 4.0 license