Authors

Alba Refoyo Martinez

Jennifer Bartell

Samuele Soraggi

Modified

November 14, 2024

Welcome to HPC-Lab Hub

Note: Actively being developed

High Performance Computing (HPC) plays a crucial role for researchers by offering the computational speed and power needed to manage large and complex data sets, perform simulations, and address intricate problems that would be impractical or too time-consuming with standard computing methods.

Knowing which HPC resources are accessible and how to use them efficiently is essential for researchers. Making the most of these resources can significantly expedite research and drive innovation. By becoming proficient with the available tools and technologies, researchers can address complex challenges, analyze extensive data sets, and execute advanced simulations with increased speed and accuracy. This module provides essential knowledge on HPC resources and best practices for their utilization.

This module offers content for three distinct courses:

  • HPC Launch: Foundations of HPC and essential knowledge of national HPC resources
  • HPC Pipes: Best practices for using workflow management systems and computational environments with HPC
  • HPC ML (Machine Learning): Insights into applying HPC to machine learning tasks, including model training, data analysis, and optimization techniques

By the end of all the modules, you will have gained practical skills in promoting reproducibility through comprehensive training in HPC resource management, workflow pipelines, and computing environments.

General Course Goals

By the end of this workshop, you should be able to apply the following concepts in the context of Next Generation Sequencing data:

  • Understand the importance of Research Data Management (RDM)
  • Make your data analysis and workflows reproducible and FAIR
  • Create FAIR computing environments using conda or Docker
  • Apply HPC best practices

We offer in-person workshops; keep an eye on the Sandbox website for upcoming events.

Acknowledgements

Our interactive exercises are developed using the webexercises R package by Barr and DeBruine (2023).

References

Barr, Dale, and Lisa DeBruine. 2023. Webexercises: Create Interactive Web Exercises in R Markdown (Formerly Webex). https://github.com/psyteachr/webexercises.
Wagner, Adina S, Laura K Waite, Małgorzata Wierzba, Felix Hoffstaedter, Alexander Q Waite, Benjamin Poldrack, Simon B Eickhoff, and Michael Hanke. 2022. “FAIRly Big: A Framework for Computationally Reproducible Processing of Large-Scale Data.” Scientific Data 9 (1): 80.
Wratten, Laura, Andreas Wilm, and Jonathan Göke. 2021. “Reproducible, Scalable, and Shareable Analysis Pipelines with Bioinformatics Workflow Managers.” Nature Methods 18 (10): 1161–68.