Introduction to Genomic Data Science
About this course
In the first half of this course, we'll investigate DNA replication and ask: where in the genome does DNA replication begin? You will learn how to answer this question for many bacteria using straightforward algorithms that look for hidden messages in the genome.
In the second half of the course, we'll examine a different biological question and ask which DNA patterns play the role of molecular clocks. The cells in your body manage to maintain a circadian rhythm, but how is this achieved at the level of DNA? Once again, we will see that by knowing which hidden messages to look for, we can start to understand the amazingly complex language of DNA. Perhaps surprisingly, we will apply randomized algorithms to solve this problem.
Finally, you will get your hands dirty and apply existing software tools to find recurring biological motifs within genes that are responsible for helping Mycobacterium tuberculosis go "dormant" within a host for many years before causing an active infection.
This course begins a series of classes illustrating the power of computing in modern biology.
What you'll learn
- Write Python programs to solve various tasks you may encounter
- Formulate a formal computational problem from an informal biological problem
- Develop algorithms for solving computational problems
- Evaluate the effectiveness of algorithms
- Apply existing software to actual biological datasets
Welcome! A brief introduction to the course and its logistics.
Week 1: A Journey of a Thousand Miles
What does a cryptic message leading to buried treasure have to do with biology? Many cellular processes are encoded as "secret messages" within an organism's DNA. But how do we decipher these messages?
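One simple way to start looking for such "secret messages" is to ask which short substrings (k-mers) occur unusually often in a stretch of DNA. The sketch below is only an illustration of that idea, not necessarily the method taught in the course; the function name `most_frequent_kmers` and the sample string are assumptions made for the example.

```python
from collections import Counter


def most_frequent_kmers(text, k):
    """Return the k-mers that occur most often in text (overlaps are counted)."""
    # Slide a window of length k across the text and count every substring.
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    if not counts:
        return []  # text is shorter than k
    best = max(counts.values())
    # Several k-mers can tie for the highest count; return them all, sorted.
    return sorted(kmer for kmer, n in counts.items() if n == best)


if __name__ == "__main__":
    # Hypothetical sample sequence, chosen only for illustration.
    print(most_frequent_kmers("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4))
    # prints ['CATG', 'GCAT']
```

Counting with a sliding window takes time proportional to the length of the text, which is why this kind of search remains practical on whole bacterial genomes.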
Week 2: Finding Replication Origins
We examine the details of DNA replication and apply these details to design an intelligent algorithmic approach to find the replication origin in a bacterial genome.
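One widely used heuristic for this task is GC skew: the running difference between the number of G and C bases tends to reach its minimum near the replication origin in many bacterial genomes. Whether the course uses exactly this approach is an assumption; the sketch below, with the hypothetical function name `minimum_skew_positions`, only shows how such a scan could look.

```python
def minimum_skew_positions(genome):
    """Return every prefix length i at which skew(i) = #G - #C in genome[:i] is minimal.

    Position 0 (the empty prefix, skew 0) is included as a candidate.
    """
    skew = 0
    min_skew = 0
    positions = [0]
    for i, base in enumerate(genome, start=1):
        if base == "G":
            skew += 1
        elif base == "C":
            skew -= 1
        # A and T (and any other symbol) leave the skew unchanged.
        if skew < min_skew:
            min_skew = skew
            positions = [i]
        elif skew == min_skew:
            positions.append(i)
    return positions


if __name__ == "__main__":
    # Hypothetical toy sequence; real genomes are millions of bases long.
    print(minimum_skew_positions("CATGGGCATCGGCCATACGCC"))
    # prints [21]
```

On a real genome, the positions returned would only mark a candidate region; a follow-up search for frequent patterns in that window would still be needed to propose an actual origin.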
Week 3: Hunting for Regulatory Motifs
Your cells "tell time" and maintain your circadian clock by turning genes on and off during the day in set patterns. This brings us to a different kind of "secret message" problem in biology: how do we find the motifs hidden in DNA that switch on genes? We develop introductory algorithms for motif-finding in genes.
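A common building block in motif finding is a way to score a candidate set of motifs, one k-mer picked from each sequence. The sketch below assumes a simple scoring scheme (the number of mismatches against the column-wise consensus); the function names `consensus` and `score` are illustrative and not taken from the course materials.

```python
from collections import Counter


def consensus(motifs):
    """Most common base in each column; ties are broken alphabetically."""
    result = []
    for column in zip(*motifs):
        counts = Counter(column)
        # Sort key: highest count first, then alphabetical order of the base.
        base = min(counts, key=lambda b: (-counts[b], b))
        result.append(base)
    return "".join(result)


def score(motifs):
    """Total number of positions where a motif disagrees with the consensus."""
    cons = consensus(motifs)
    return sum(a != b for motif in motifs for a, b in zip(motif, cons))


if __name__ == "__main__":
    # Hypothetical candidate motifs, one per sequence.
    candidates = ["ATGCA", "ATGGA", "TTGCA", "ATCCA"]
    print(consensus(candidates))  # prints ATGCA
    print(score(candidates))      # prints 3
```

A lower score means the chosen k-mers resemble each other more closely, so a motif-finding algorithm can be framed as a search for the selection of k-mers that minimizes this score.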
Week 4: How Rolling Dice Helps Us Find Regulatory Motifs
We see how to improve upon these motif-finding approaches by designing randomized algorithms that can "roll dice" to find motifs and perform quite well in practice.
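As a rough illustration of the "rolling dice" idea, the sketch below starts from randomly chosen k-mers, builds a profile from them, picks the most probable k-mer in each sequence under that profile, and repeats until the score stops improving. It is a minimal sketch under stated assumptions, not the course's reference implementation: the helper names, the pseudocount of 1, and the default of 200 restarts are all choices made for this example.

```python
import random

BASES = "ACGT"


def build_profile(motifs):
    """Column-wise base frequencies, with a pseudocount of 1 per base (assumption)."""
    total = len(motifs) + len(BASES)
    return [
        {b: (sum(m[j] == b for m in motifs) + 1) / total for b in BASES}
        for j in range(len(motifs[0]))
    ]


def most_probable_kmer(text, k, profile):
    """Return the k-mer in text with the highest probability under the profile."""
    best_kmer, best_prob = text[:k], -1.0
    for i in range(len(text) - k + 1):
        kmer = text[i:i + k]
        prob = 1.0
        for j, base in enumerate(kmer):
            prob *= profile[j][base]
        if prob > best_prob:
            best_kmer, best_prob = kmer, prob
    return best_kmer


def score(motifs):
    """Number of bases that differ from the most common base in their column."""
    return sum(
        len(col) - max(col.count(b) for b in BASES) for col in zip(*motifs)
    )


def randomized_motif_search(dna, k, rng):
    """One run: random start, then iterate profile -> motifs until no improvement."""
    motifs = []
    for seq in dna:
        start = rng.randrange(len(seq) - k + 1)
        motifs.append(seq[start:start + k])
    best = motifs
    while True:
        profile = build_profile(motifs)
        motifs = [most_probable_kmer(seq, k, profile) for seq in dna]
        if score(motifs) < score(best):
            best = motifs
        else:
            return best


def best_of_many_runs(dna, k, runs=200, seed=0):
    """Repeat the randomized search and keep the lowest-scoring motif set."""
    rng = random.Random(seed)  # fixed seed so results are reproducible
    best = None
    for _ in range(runs):
        candidate = randomized_motif_search(dna, k, rng)
        if best is None or score(candidate) < score(best):
            best = candidate
    return best
```

A single run often ends in a poor local optimum, which is why the search is restarted many times and only the best-scoring result is kept. The input is assumed to contain only the characters A, C, G, and T, with every sequence at least k bases long.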
Week 5: Finishing Up
Bioinformatics Application Challenge: Motif-Finding. Using popular software built on the motif-finding algorithms covered in the course, we hunt for motifs in a real biological dataset.
In an end-of-course assessment, we will ask you to answer Course Review questions. This gives you the opportunity to tell us how the course went for you. The assessment will provide data for our research study and help us improve our courses for future learners.