Introduction to Next Generation Sequencing data
A course of the Summer School of Aarhus University
Computing and didactical support from the Danish Health Data Science Sandbox
The material for this course is organized in four separate Jupyter notebooks in both
R where you will benefit from an interactive coding setup on
After the course, you will have knowledge of bioinformatics methods for analyzing genomes using NGS data, including knowledge of the existing types of genome data, how the different types of data can be displayed and analyzed, the current methods for genome assembly and analysis, their accuracy and how they can be used. The course will enable you to devise and run a project that makes use of NGS data.
📚 Prerequisites: This is an introductory course that requires a basic understanding of the biology behind sequencing. Programming experience is not required, though it would help!
1. Describe key challenges in the analysis of NGS data
2. Explain the theoretical foundation for methods that use NGS for assembly and analysis of genomes
3. Discuss the bioinformatic methods for genome analysis and hypothesize what drives the outcome of the methods
4. Review original literature within the subjects and relate the discussed topics to analysis scenarios
5. Apply bioinformatics tools within the selected application areas and reflect on the results, formulating your own conclusion in the proposed tasks
🕰 Total Time Estimation: 20 hours
📁 Supporting Materials:
- Jupyter notebooks for interactive coding
- lecture slides from the instructor
You can find the links to the material in the table at the bottom of this page.
🖍 Course authors and instructors:
📝 Citation: If you use any of this material for your research, please cite this course with the DOI below, and acknowledge the Health Data Science Sandbox project of the Novo Nordisk Foundation (grant number NNF20OC0063268). Citing and acknowledging the project is of great help in supporting it.
📧 Contact: Samuele Soraggi (samuele at birc.au.dk).
Course structure and Instructions
The course exercises are organized in four exercise modules.
The first one is executed with the web interface usegalaxy.org (click on
1.Galaxy Exercise in the menu for instructions).
Afterwards, we will work on a computing environment to use
2.Instructions for instructions.
3.Course exercises contains all the compiled exercises as a reference.
Course material 2022
Here you find a table with the instructor's slides from 2022.
|Mapping to reference
|SNPs and structural variants
|Microbiomes and metagenomics
|Single cell RNA sequencing
Course material 2023 (available after the course ends)
Here you find a table with the instructor's slides and a link to the compiled notebooks, which you can also run on your own by following the
instructions on this webpage. Data alignment can also be performed on the
Galaxy interactive webpage (see the
Galaxy exercise on this webpage).