Welcome to the HPC-Launch workshop

PLEASE READ BEFORE COURSE!

Required preparation

You are expected to make sure you can sign in to UCloud, SDU’s HPC platform, on which we will be running this course. All data, exercises, and tools will be provided on UCloud. Please use your university ID to sign in (instructions below). If you run into problems, please write to us (reply to the email that brought you to this page).

A Danish university/institution account is required to create a UCloud account. If you don’t have one, click on Local setup below.

Access Sandbox resources

Our first choice is to run our tutorials on UCloud, the supercomputer located at the University of Southern Denmark. Anyone using these resources needs the following:

  1. A Danish university ID so you can sign on to UCloud via WAYF (see footnote 1).

For UCloud access, click here

  2. Basic ability to navigate in Linux. You don’t need to be an expert, but teaching you how to code from zero is beyond the scope of this course and its material.

  3. Click the invite link below to accept the invitation to the Sandbox workspace (after accessing UCloud for the first time). This way, we can provide you with compute resources for the active sessions of the workshop (see footnote 2).

Invite link to UCloud workspace

  4. Create an account on GitHub. Follow these instructions.

  5. You’re all set! You will receive instructions on how to navigate through UCloud during the course.

  6. Download the slides (link active on April 8th) and RDM figure:
  • RDM roadmap

Independent users

If you are interested in running the exercises locally, click on the box below. This is not recommended for participants in our workshops.

Local setup required if not using UCloud

You are expected to complete the required setup, including tool installation and account creation.

  • Git for version control of your projects

  • GitHub account for hosting and collaborating on projects

  • Python

  • pip for managing Python packages

  • Cookiecutter for creating folder structure templates (pip install cookiecutter)

  • md5sum. See below how to install

    Terminal
    # ---- cookiecutter -----
    pip install cookiecutter
    
    # ---- md5sum (part of the coreutils package) ----
    # On Ubuntu/Debian (installing system packages usually requires sudo)
    sudo apt-get install coreutils
    # On macOS (requires Homebrew)
    brew install coreutils
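
Once the tools above are installed, it can help to confirm that each one is actually reachable from your terminal. The following is a minimal sketch, not part of the official course material: `check_tools` is a hypothetical helper name, and the command names `python3` and `pip` are assumptions that may differ on your system (for example `python` or `pip3`).

```shell
# check_tools: report whether each named command is available on PATH.
# Hypothetical helper for illustration; returns 0 if all commands are
# found and 1 if at least one is missing.
check_tools() {
    ct_missing=0
    for ct_tool in "$@"; do
        if command -v "$ct_tool" >/dev/null 2>&1; then
            echo "ok       $ct_tool"
        else
            echo "MISSING  $ct_tool"
            ct_missing=1
        fi
    done
    return "$ct_missing"
}

# Command names below are assumptions; adjust them to your installation.
check_tools git python3 pip cookiecutter md5sum \
    || echo "Install the missing tools before the workshop."
```

Any line starting with MISSING points to a tool that still needs to be installed or added to your PATH.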

Highly recommended

  • Zenodo account for archiving and sharing your research outputs
  • DeiC DMP

If you run into any issues installing the software, don’t worry! We will provide access to a Danish HPC platform, UCloud, with all the necessary software pre-installed. Please read the next section carefully.

Reading material

About Research Data Management (RDM):

  • Sandbox Research data management
  • The Turing way

About High-Performance Computing (HPC):

  • Nvidia HPC

Agenda

Time    Activity
8:45    Morning coffee (optional)
9:00    Introduction to the Sandbox project
9:15    Introduction to HPC: the basics
10:15   Coffee break
10:30   HPC workflow
11:15   Intro to RDM
12:00   Lunch break
13:00   RDM Step-by-step I
14:15   Coffee break
14:30   RDM Step-by-step II
15:00   DK HPC solutions & resources

Discussion and feedback

We hope you enjoyed the workshop. As data scientists, we would also really appreciate some quantifiable feedback: we want to build things that the Danish health data science community is excited to use. Please fill out the feedback form before you head out for the day (see footnote 3).

Nice meeting you and we hope to see you again!

About the National Sandbox project

The Health Data Science Sandbox aims to be a training resource for bioinformaticians, data scientists, and those generally curious about how to investigate large biomedical datasets. We are an active and developing project seeking interested users (both trainees and educators). All of our open-source materials are available on our GitHub page and can be used on a computing cluster! We work with UCloud, GenomeDK, and Computerome, the major Danish academic supercomputers.

Footnotes

  1. Other institutions (e.g. hospitals, libraries, …) can log on through WAYF. See all institutions here.

  2. To use Sandbox materials outside of the workshop: remember that each new user has hundreds of hours of free computing credit and around 50 GB of free storage, which can be used to run any UCloud software. If you run out of credit (which takes a long time), you’ll need to check with the local DeiC office at your university about how to request compute hours on UCloud. Contact us at the Sandbox if you need help or want more information.

  3. Link activated on the day of the workshop.

Copyright

CC-BY-SA 4.0 license