Welcome to the HPC-Launch workshop
Agenda
Time | Activity | Time | Activity |
---|---|---|---|
8:45 | Morning coffee (optional) | | |
9:00 | Introduction to the Sandbox project | 12:00 | Lunch break |
9:15 | Introduction to HPC: the basics | 13:00 | Step-by-step: solutions I |
10:15 | Coffee break | 14:15 | Coffee break |
10:30 | DK HPC resources, access, and intro to UCloud | 14:30 | Step-by-step: solutions II |
11:15 | Intro to RDM for health data science | 16:00 | Discussions & Wrap-up |
Course requirements
You are expected to complete the required setup, including tool installation and account creation.
Git for version control of your projects
A Zenodo account for archiving and sharing your research outputs
pip for managing Python packages
Cookiecutter for creating folder structure templates (`pip install cookiecutter`)
Terminal

    # ---- cookiecutter -----
    pip install cookiecutter

    # ---- md5sum from coreutils package -----
    # On Ubuntu/Debian
    apt-get install coreutils
    # On macOS
    brew install coreutils
Highly recommended: a GitHub account for hosting and collaborating on projects
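To confirm that the command-line tools listed above are installed, a small script along these lines may help. It is only a sketch: the command names in `REQUIRED_TOOLS` and the helper `find_tools` are illustrative assumptions, not part of the workshop materials, and the `gmd5sum` fallback assumes Homebrew may install the coreutils command under a `g` prefix on some macOS setups.

```python
import shutil

# Assumed command names for each required tool (hypothetical mapping).
# Several candidates are listed where the command name can differ by platform.
REQUIRED_TOOLS = {
    "git": ["git"],
    "pip": ["pip", "pip3"],
    "cookiecutter": ["cookiecutter"],
    "md5sum": ["md5sum", "gmd5sum"],
}


def find_tools(tools=REQUIRED_TOOLS):
    """Return {tool: path or None}, using the first candidate found on PATH."""
    found = {}
    for name, candidates in tools.items():
        # shutil.which returns the full path of a command, or None if missing.
        found[name] = next(
            (path for path in map(shutil.which, candidates) if path), None
        )
    return found


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name:<14}{path or 'NOT FOUND'}")
```

Any tool reported as `NOT FOUND` either needs installing or is not on your `PATH`. Accounts (Zenodo, GitHub) cannot be checked this way.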
Note: If you encounter any issues, we will grant access to a Danish HPC platform where all the necessary software is pre-installed. Please read the next section carefully.
Using UCloud for exercises
- Create an account on UCloud
- Use the link below to join our workspace, where you will find an environment that is already set up
Invite link to UCloud workspace
Discussion and feedback
We hope you enjoyed the workshop. As data scientists, we would also really appreciate some quantifiable feedback: we want to build things that the Danish health data science community is excited to use. Please fill out the feedback form [LINK] before you head out for the day.
You can download our RDM roadmap here.
About the National Sandbox project
The Health Data Science Sandbox aims to be a training resource for bioinformaticians, data scientists, and those generally curious about how to investigate large biomedical datasets. We are an active and developing project seeking interested users (both trainees and educators). All of our open-source materials are available on our GitHub page and can be used on a computing cluster! We work with UCloud, GenomeDK, and Computerome, the major Danish academic supercomputers.