HPC jobs
Best practices for running a job on a cluster
Job scheduler
Note: there are many job scheduler programs available, but a very common one is SLURM. Useful commands:
```bash
# Submit the job
sbatch
# Check that the job is in the queue
squeue
```
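A few more commands are often useful in day-to-day work; a minimal sketch is below, where the job ID 12345 is a placeholder:

```bash
# Show only your own jobs ($USER is your cluster username)
squeue -u "$USER"
# Cancel a job by its job ID
scancel 12345
# Show the full details of a pending or running job
scontrol show job 12345
```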
Toy example of a bash script to submit:

```bash
#!/bin/bash
#SBATCH -D /home/USERNAME/   # working directory
#SBATCH -c 1                 # number of CPUs. Default: 1
#SBATCH -t 00:10:00          # time limit for the job, HH:MM:SS
#SBATCH --mem=1G             # RAM memory
# my commands: software, pipeline, etc.
snakemake -j1
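```

Note that snakemake -j1 restricts the workflow to one core, matching the single CPU requested with -c 1. Saved as, say, toy_job.sh (a placeholder name), the script would be submitted with:

```bash
sbatch toy_job.sh
```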
Job parallelisation
Job parallelisation is crucial for achieving high performance and running jobs effectively on an HPC system. Here are two key scenarios where it is particularly important:
- Independent computational tasks: When tasks are independent of each other, parallelisation can enhance efficiency by allowing them to run concurrently.
- Multi-threaded tools: Some tools are specifically designed to perform parallel computations through multi-threading, enabling them to utilise multiple CPU cores for increased performance; see the sketch after this list for matching a tool's thread count to the requested CPUs.
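For the multi-threaded case, a minimal sketch of a job script is shown below. The tool name my_tool and its --threads flag are hypothetical placeholders for any multi-threaded program; SLURM_CPUS_PER_TASK is the environment variable SLURM sets from -c, which keeps the thread count in sync with the requested CPUs:

```bash
#!/bin/bash
#SBATCH -c 4                 # request 4 CPUs for multi-threading
#SBATCH -t 00:30:00          # time limit, HH:MM:SS
#SBATCH --mem=4G             # RAM memory

# my_tool and its --threads flag are placeholders;
# SLURM_CPUS_PER_TASK equals the value requested with -c
my_tool --threads "$SLURM_CPUS_PER_TASK" input.data
```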
Job parallelisation using SLURM
Job arrays: -a
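As a minimal sketch of a job array (file names below are placeholders), -a 1-10 launches ten copies of the script, each with a different value of the SLURM_ARRAY_TASK_ID environment variable, which can be used to pick a different input per task:

```bash
#!/bin/bash
#SBATCH -D /home/USERNAME/   # working directory
#SBATCH -c 1                 # number of CPUs per array task
#SBATCH -t 00:10:00          # time limit per array task, HH:MM:SS
#SBATCH --mem=1G             # RAM memory per array task
#SBATCH -a 1-10              # run tasks 1..10 as a job array

# Each task processes its own input; sample_1.txt ... sample_10.txt
# and process_sample.sh are placeholder names
./process_sample.sh "sample_${SLURM_ARRAY_TASK_ID}.txt"
```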
Efficient resource usage
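A common way to use resources efficiently is to request only what a job actually needs. One approach, assuming SLURM accounting is enabled on the cluster, is to check what a finished job actually used and adjust future requests accordingly (12345 is a placeholder job ID):

```bash
# Summary of CPU and memory efficiency for a finished job
# (seff ships with SLURM's contribs and may not be installed everywhere)
seff 12345
# More detailed accounting: peak memory (MaxRSS) and elapsed time
sacct -j 12345 --format=JobID,JobName,Elapsed,MaxRSS,State
```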
Copyright: CC-BY-SA 4.0 license