Bulk RNAseq data analysis

Modified

January 29, 2025

This workshop material is a tutorial on how to approach RNAseq data analysis, starting from your sequencing reads (FASTQ files). The workshop therefore only briefly touches upon laboratory protocols, library preparation, and experimental design of RNA sequencing experiments, mainly to outline considerations for the downstream bioinformatic analysis. It is based on materials developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC), as well as a collection of modified tutorials from the DESeq2 and R language vignettes and the nf-core rnaseq pipeline.

Course Overview

📖 Syllabus:

Course introduction
Experimental planning
Data explanation
Read reprocessing and preprocessing pipelines
Analysing RNAseq data
1. RNAseq counts
2. Exploratory analysis
3. Differential Expression Analysis
4. Functional analysis
Summarized workflow

⏰ Total Time Estimation: 8 hours
📁 Supporting Materials: Workshop slides covering the theory of bulk RNAseq are available in this Zenodo repository.
👨‍💻 Target Audience: PhD students, MSc students, etc.
👩‍🎓 Level: Beginner.
🔒 License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
💰 Funding: This project was funded by the Novo Nordisk Fonden (NNF20OC0063268).

Course Requirements

Knowledge of R, RStudio and R Markdown. It is recommended that you have at least followed our workshop R basics.
Basic knowledge of RNAseq technology
Basic knowledge of data science and statistics such as PCA, clustering and statistical testing

The aim of this repository is to run a comprehensive but introductory workshop on bulk RNAseq bioinformatic analyses. Each module of this workshop is accompanied by a PowerPoint slideshow, ideally presented by a teacher, that explains the steps and the theory behind a typical bioinformatics analysis. Many of the slides are annotated with extra information and/or point to original sources for further reading.

Course Goals

By the end of this workshop, you should be able to analyse your own bulk RNAseq data:

Preprocess your reads into a count matrix.
Normalize your data.
Explore your samples with PCAs and heatmaps.
Perform Differential Expression Analysis.
Annotate your results.

Acknowledgements

We recognize the substantial contribution of José Alejandro Romero Herrera, a former team member, in developing the course material. Other members who have contributed to the development of this course are:

Member	Role	Institution	PI
Jennifer Bartell	Project Manager, Data Scientist	Center for Health Data Science, KU	Anders Krogh
Diana Andrejeva	Data Scientist	Center for Health Data Science, KU	Anders Krogh
Samuele Soraggi	Data Scientist	Bioinformatics Research Centre, AU	Mikkel Schierup

We would also like to extend our gratitude to:

Center for Health Data Science, University of Copenhagen.
Hugo Tavares, Bioinformatics Training Facility, University of Cambridge.
Silvia Raineri, Center for Stem Cell Medicine (reNew), University of Copenhagen.
Harvard Chan Bioinformatics Core (HBC); check out their GitHub repo.
The nf-core community.

Course Instructors

Adrija Kalvisa

Alba Refoyo Martinez

Henrike Zschach

Thilde Terkelsen