
Introduction to Population Genomics

A course of the Danish Health Data Science Sandbox

This course is based on material developed for the Population Genomics course at Aarhus University. The material is organized into four separate Jupyter notebooks that use R, bash, and Python, giving you an interactive coding setup.

If you use any of this material in your research, please cite this course with the DOI below and acknowledge the Health Data Science Sandbox project of the Novo Nordisk Foundation (grant number NNF20OC0063268). Doing so helps support the project and the creation of new courses. DOI

Course description

The course introduces key concepts in population genomics, from the generation of population genetic data sets to the most common population genetic analyses and association studies. The first part of the course focuses on generating population genetic data sets. The second part introduces the most common population genetic analyses and their theoretical background. Topics here include the analysis of demography, population structure, recombination, and selection. The last part of the course focuses on applying population genetic data sets to association studies in relation to human health.


Prerequisites

This is an introductory course that requires a basic understanding of genomics. Programming experience is not required, though it helps.

Learning Outcomes

After the course, you will have detailed knowledge of the methods and applications required to perform a typical population genomic study. You will be able to:

  • Identify an experimental platform relevant to a population genomic analysis.
  • Apply commonly used population genomic methods.
  • Explain the theory behind common population genomic methods.
  • Reflect on strengths and limitations of population genomic methods.
  • Interpret and analyze results of population genomic inference.
  • Formulate population genetics hypotheses based on data.

Supporting material

  • Jupyter notebooks for interactive coding
  • Structure of the course with lecture list

The curriculum for each week of the course is listed below. "Coop" refers to a set of lecture notes by Graham Coop that are used throughout the course.

Course duration and structure

This course runs for one semester.

  1. Course intro and overview:
    • Lecture (Kasper): Coop chapters 1, 2, 3; Paper: Genome Diversity Project
    • Exercise: Cluster practicals
  2. Drift and the coalescent:
  3. Recombination:
  4. Population structure and incomplete lineage sorting:
  5. Hidden Markov models:
  6. Ancestral recombination graphs:
  7. Past population demography:
  8. Direct and linked selection:
  9. Admixture:
    • Lecture: Review: Admixture; Paper: Admixture inference
    • Exercise: Detecting archaic ancestry in modern humans
  10. Genome-wide association study (GWAS):
  11. Heritability:
    • Lecture: Coop lecture notes Sec. 2.2 (p23-36) + Chap. 7 (p119-142)
    • Exercise: Association testing
  12. Evolution and disease:

Course authors

Head of the course: Kasper Munch.

Contact: Samuele Soraggi (samuele at birc.au.dk).