Welcome to RDM for biodata
The course “Research Data Management (RDM) for biological data” provides participants with foundational knowledge and practical skills for handling the extensive data generated by modern studies. It covers essential principles and best-practice guidelines in data organization, metadata annotation, version control, and data preservation. These principles are explored from a computational perspective, so participants gain hands-on experience applying them to real-world scenarios in their research labs and in their daily data analysis work. The course also emphasizes Open Science and FAIR principles, which promote collaboration and reproducibility in research. By the end of the course, attendees will have essential tools and techniques to address the data challenges prevalent in today’s research landscape, with a focus on fields related to omics, health, and bioinformatics.
- 📖 Syllabus:
  - Data Lifecycle Management
  - Data Management Plans (DMPs)
  - Data Organization and Storage
  - Documentation standards for biodata
  - Version Control and Collaboration
  - Processing and analyzing biodata
  - Storing and sharing biodata
- ⏰ Total Time Estimation: X hours
- 📁 Supporting Materials:
- 👨‍💻 Target Audience: Ph.D., MSc, and anyone interested in RDM for NGS data or related fields within bioinformatics.
- 👩‍🎓 Level: Beginner.
- 🔒 License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
- 💰 Funding: This project was funded by the Novo Nordisk Fonden (NNF20OC0063268).
This course offers participants an in-depth introduction to effectively managing the vast amounts of data generated in modern studies. Throughout the program, emphasis is placed on a practical understanding of RDM principles and on the efficient handling of large datasets. In this context, participants will learn why adopting Open Science and FAIR principles is necessary for enhancing data accessibility and reusability. Special attention is given to the development of Data Management Plans (DMPs), with examples tailored to omics data, ensuring compliance with institutional and funding agency requirements while maintaining data integrity.
Although DMPs are essential, they are often too general and lack specific guidelines for practical implementation. That is why this course covers the practical aspects in detail. Participants will acquire practical skills for organizing data, including creating folder and file structures and adding metadata to facilitate data discoverability and interpretation. Attendees will also gain insights into setting up simple databases and using version control systems to track changes in data analysis, thereby promoting collaboration and reproducibility. The course concludes with a focus on archiving and data repositories, covering strategies for preserving and sharing data for long-term scientific use. By the end of the course, attendees will be equipped with essential tools and techniques to navigate the challenges prevalent in today’s research landscape. This will not only foster successful data management practices but also enhance collaboration within the scientific community.
- Basic understanding of Next Generation Sequencing data and formats.
- Command-line experience
- Basic programming experience
- Quarto or MkDocs tools
By the end of this workshop, you should be able to apply the following concepts in the context of Next Generation Sequencing data:
- Understand the Importance of Research Data Management (RDM)
- Familiarize Yourself with FAIR and Open Science Principles
- Draft a Data Management Plan for Your Own Data
- Establish File and Folder Naming Conventions
- Enhance Data with Descriptive Metadata
- Implement Version Control for Data Analysis
- Select an Appropriate Repository for Data Archiving
- Make Your Data Analysis and Workflows Reproducible and FAIR
This is a computational workshop that focuses primarily on the digital aspect of our data. While wet lab Research Data Management (RDM), involving protocols, instruments, reagents, and ELN or LIMS systems, is integral to the entire RDM process, it won’t be covered in this course.
Effective data management also requires strategies that ensure security and privacy. These aspects are important, but they won’t be covered in this course. We highly recommend enrolling in the GDPR course offered by the Center for Health Data Science, especially if you’re working with sensitive data. That course focuses specifically on GDPR compliance and will provide you with valuable insights and skills in managing data privacy and security.
Danish institutional RDM links
Acknowledgements
- RDMkit, ELIXIR (2021) Research Data Management Kit. A deliverable from the EU-funded ELIXIR-CONVERGE project (grant agreement 871075).
- University of Copenhagen Research Data Management Team.
- Martin Proks and Sarah Lundregan, Brickman Lab, NNF Center for Stem Cell Biology (reNEW), University of Copenhagen.
- Richard Dennis, Data Steward, NNF Center for Stem Cell Biology (reNEW), University of Copenhagen.
- NBISweden.