| # | Date | General topic | Instructor | Resources |
|---|---|---|---|---|
| 1 | 02/15/22 | Course overview and introduction to computational genomics<br>Course overview, git/GitHub, history of genomics and sequencing technology, genomic data scale, popular topics in genomics, computing in genomics | Nathan Sheffield | |
| 2 | 02/17/22 | Statistics and probability review 1<br>Random Variables, Probability Distributions, Expectation, Variance, Moment-Generating Functions, Central Limit Theorem | Stefan Bekiranov | |
| 3 | 02/22/22 | Statistics and probability review 2<br>statistical tests, p-value, type I and type II errors, multiple testing corrections, FDR, ROC | Chongzhi Zang | |
| Unit 1: Genome | | | | |
| 4 | 02/24/22 | Fundamental string matching algorithms<br>Local vs. global alignment, Dynamic programming, Heuristic approaches, BLAST | Aakrosh Ratan | |
| 5 | 03/01/22 | Suffix trees, Suffix arrays, and Burrows-Wheeler transform<br>Short-read alignments | Aakrosh Ratan | |
| 6 | 03/03/22 | Bayes theorem, Likelihood, and Expectation-Maximization<br>Variant calling, Structural Variants | Aakrosh Ratan | |
| 7 | 03/08/22 | Spring Recess | | |
| 8 | 03/10/22 | Spring Recess | | |
| 9 | 03/15/22 | De Bruijn graphs and String graphs<br>Genome assembly | Aakrosh Ratan | |
| 10 | 03/17/22 | Hidden Markov Models (HMMs)<br>Gene-finding, CpG islands and Chromatin states, Gibbs sampling, Expectation maximization | Aakrosh Ratan | |
| 11 | 03/22/22 | Linear Regression, Chi-Squared Test of Independence<br>Genome Wide Association Studies, eQTLs | Stefan Bekiranov | |
| Unit 2: Epigenome | | | | |
| 12 | 03/24/22 | Regulatory DNA, Transcription factors, Sequence motifs<br>PWMs, information entropy, motif finding algorithms | Chongzhi Zang | |
| 13 | 03/29/22 | ChIP-seq, Epigenome profiles, Peak detection<br>ChIP-seq, read mapping, epigenomic profile construction, narrow peak calling | Chongzhi Zang | |
| 14 | 03/31/22 | Epigenomic domains, Hierarchy and scales of genome structure<br>Histone modifications, broad peak calling, chromatin domains, 3D genome basics | Chongzhi Zang | |
| 15 | 04/05/22 | Genomic intervals: formats, data structures and algorithms<br>Genomic intervals; genomic interval file formats; interval operations; interval data structures (R-trees, B+ trees, NCList); interval search | Nathan Sheffield | |
| 16 | 04/07/22 | ATAC-seq diagnostics and harmonization<br>ATAC-seq count data; data diagnostics; clip functions; consensus peaks; tests of normality; quantile normalization; Q-Q plots; batch correction | Nathan Sheffield | |
| 17 | 04/12/22 | Scalable computing in genomics<br>Parallelization, workflow management, optimization, Big-Oh complexity, Efficiently processing large sequencing data | Nathan Sheffield | |
| Unit 3: Transcriptome | | | | |
| 18 | 04/14/22 | Genomic data standards and reference genomes<br>Standards and interoperability; GA4GH; Reference genomes; refget; sequence collections; APIs; other standards | Nathan Sheffield | |
| 19 | 04/19/22 | K-mer analysis<br>RNA pseudoalignment; membership testers; Bloom filters | Nathan Sheffield | |
| 20 | 04/21/22 | Dimensionality reduction<br>Curse of Dimensionality, PCA, NMF, t-SNE, UMAP | Stefan Bekiranov | |
| 21 | 04/26/22 | Differential expression analysis<br>Microarray and Bulk RNA-seq Analysis | Stefan Bekiranov | |
| 22 | 04/28/22 | Spatial omics, Encoding of genomic data<br>MERFISH, spatial transcriptomics, simplex encoding, Hamming codes | Chongzhi Zang | |
| 23 | 05/03/22 | Clustering, transcriptomic data integration<br>Clustering algorithms, regulatory networks, transcriptional regulation | Chongzhi Zang | |
| Final presentations | | | | |
| 24 | 05/05/22 | Final Presentations | | |
| 25 | 05/10/22 | Final Presentations | | |

Throughout the semester, there will be approximately 6 homework assignments. These are typically programming assignments that involve implementing a method or algorithm or performing a data analysis, and they may also include written components or theoretical problems. Assignments will generally follow a roughly two-week cycle, but there is no fixed schedule and due dates will vary by assignment. Each assignment is worth 10% of the final grade.
Students should complete assignments individually. You are encouraged to work together at the level of sharing ideas, concepts, suggested functions, or reading material, but you should not share or seek out completed solutions to the assignments.
Each student will be assigned a single class session to serve as scribe. The role of the scribe is to take detailed notes for the class on the topic of the class session. This should include background preparation before the assigned class session, note-taking during the class session, expansion of related topics discussed in the class, and final polishing and writing up of the notes after the class session.
Other class members are welcome to also contribute to notes for any class session, but the primary responsibility for polishing and integrating the notes belongs to the scribe.
The notes should be submitted into the class scribe repository on GitHub (linked above).
At the end of the course, all class members will have access to a “book” of compiled class notes, which will be public.
Students are expected to attend class. There is no textbook, but each lecture will have reading material posted, which students should read before the lecture. Plan to invest roughly 3 hours per week on the posted reading. We will not have exams or otherwise test your reading, so this is on your own; the 3-hour figure is our guideline to make sure you get the most benefit from the class. Feel free to increase or decrease this moderately depending on how much a topic interests you. The lectures will be most useful if you do the reading beforehand, so that you can come prepared with some background to ask questions.
The final presentation should cover one or more methods in computational genomics, either a method we covered in class or one we did not cover that you want to present to the class. Your talk should include an introduction with context for the method, the details of the method, and a research application. The presentation may focus on a particular application in your own research area, or the application could be a research question about extending the method. Aim for a 10-minute talk, with a couple of minutes for questions. Students should think of the final presentation as roughly equivalent to one homework assignment in terms of expected preparation.
Given the diversity of instructors in the course, we do not plan to hold regular office hours, but students should feel free to reach out to any instructor via e-mail to schedule a meeting. We will be available to meet individually with students as needed.
If you need to miss a lecture, we will address it on a case-by-case basis. To make up a missed lecture you might, for example, go through the slides, study the topic on your own, and contribute your notes to the scribe repository; alternatively, we may record the lecture for you to review on your own.
We do not intend to record lectures generally, but instructors may decide to record on a lecture-by-lecture basis, either for students who are missing or for other reasons. The University prohibits the recording of live class sessions unless all students have been informed that recording will occur and that recordings may be stored. We therefore notify students that classes may be recorded at the discretion of the instructor. Any recording will follow UVA protocol: it will be stored for instructional purposes with students enrolled in the same class during the same term, and may only be stored on University-owned password-protected sites.