CS 466: Introduction to Bioinformatics
Detailed Syllabus:

Jan 18: Introduction to course
(PPTX)
(PDF)

Jan 20 to 27:
Core CS/Math material: graphs, counting,
running time analysis,
and NPhardness. (4 classes).
 Reading assignment:

Jan 20 (review of CS 173 material):
Appendix B.1B.4 from
Computational Phylogenetics (review of 173
review), and
running time analysis and BigO notation
(PDF),

Jan 23 (P and NP):
Appendix B.5 from
Computational Phylogenetics

Jan 25 (NPhardness):
Appendix B.6B.7 from
Computational Phylogenetics
and (PDF).
 Jan 27:
(a) Combinatorial counting (PDF)
(PPTX) and
(b) Eulerian graphs
(PDF),
slides by Jeremy Martin, of the University of Kansas.
 Lectures:
 Graphs (PDF) (January 20)
 Running time analysis (PDF)
(January 23)
 NPhardness (PDF)
(January 25)

Using induction to establish bounds (e.g., BigO)
(PDF) and
Combinatorial counting
(PPTX)
(PDF)
(January 27)
 Jan 30 to Feb 1: Genome Assembly (2 classes)
 Reading assignment:
 Lectures:

Feb 3. Review, and reading quiz (grade will not count).

Feb 68.
Core CS/math material: dynamic programming.
(2 classes)
 Reading assignment:

Feb 6: Appendix B.8 from
from Computational Phylogenetics, and
Introduction to Dynamic Programming
(PDF)
 Lectures

Introduction to Dynamic Programming
(PDF)

Feb 10 and 13: Pairwise alignment and computing
minimum edit distances (2 classes)

Reading assignment:

Feb 10: Sections 9.19.4 from Computational Phylogenetics, and
Basics of molecular biology (PDF).

Lectures:

DP algorithm for pairwise edit distance
(PDF)

Tree alignment  Part 1
(PDF)

Feb 15 to 22: Hidden Markov models and
their applications in bioinformatics.
Core CS/math material:
introduction to probability theory.
(4 classes)
 Reading assignment:
 Feb 15: Sections 9.69.7.2 from Computational Phylogenetics,
and
Mark Paskin's Introduction to Probability Theory (PDF)
 Feb 17: Sections 9.7.3 and 9.7.4 from
Computational Phylogenetics, and
Mona Singh's course notes on profile HMMs
(PDF),

Optional Reading assignment:
J&P: Chapter 9.69.8
 Lectures:

Feb 15: Introduction to HMMs
(PPTX)
(PDF)

Feb 17:
Introduction to probability theory
(PDF),
notes created by Erin Molloy.

Feb 20:
The Viterbi Algorithm
(PDF)

Feb 22: Kruskal's algorithm for Minimum Spanning Trees

Feb 24 to March 3: Multiple sequence alignment.
(4 classes)
 Reading assignment:
 Feb 24: Section 9.5 and 9.7.5 from
Computational Phylogenetics.
Also read
Kruskal's Algorithm: Correctness Analysis
by Valentine Kabenets of Simon Fraser University.
Finally, see (PDF), scribe
notes for approximation
algorithms from http://www.cs.dartmouth.edu/~ac/Teach/CS105Winter05/.
 Feb 27: Sections 9.119.13, 9.15, 9.16 from Computational Phylogenetics
and
(PDF).
 March 3: Section 9.19 from Computational Phylogenetics
 Lectures:
 Feb 24: Tree alignment  part 2
(PDF).

Feb 27: The performance of methods for GTA
(PPTX)
(PDF)

March 1: More about
MSA in practice
(PDF)
(PPTX)
 March 3: Reading quiz

March 6, 8, 13, and 15 (March 10
is cancelled): Phylogenetic trees and
designing heuristics for NPhard problems
(4 classes)
 Reading assignment:
 March 6: Sections 1.11.3 from Computational Phylogenetics
 March 8: Sections 8.1 and 8.2 from Computational Phylogenetics
 March 13: Sections 8.4, 8.6, 8.8,
and
4.14.3 from Computational Phylogenetics
 March 15: Sections 11.1 and 11.2 from Computational Phylogenetics
 Lectures (note March 10 is cancelled):
 March 6: Introduction to distancebased tree estimation
(PDF)
(PPT)
 March
8, 13, and 15: Introduction to trees
(PDF) and
Statistical Phylogenetics
(PDF)
(PPT).
Please note that we covered material from Sections
1.41.6, 1.8, and 8.5
from the textbook (this is assigned
reading for next week).
 March 17: Evaluating methods

Reading assignment:
Sections 1.41.6, 1.8, 8.5,
Appendix B.10 from
Computational Phylogenetics,
and Research Ethics.
 Lecture
(PDF)

March 2024: SPRING BREAK

March 27 and 29:
Review for midterm.
(You are responsible for all assigned reading
and all content covered in any lecture, including
but not limited to
the posted PDFs and powerpoint slides.)
 March 31: Midterm exam (in class)
 April 3: Final project proposals due.
Discussion of midterm exam.

April 5 and 7: Tutorials on how to
compute
multiple sequence alignments and maximum likelihood
trees using standard software.
See https://pranj.al/mammaliantutorial.html for the dataset and links to software.

April 10.

Students present final project proposals
 Amanda Shen.
Literature review:
Comparison of
shortread mapping methods.
(PDF)
 Milica HadziTanovic.
Literature review:
Comparison of heuristics for maximum likelihood tree
estimation.
(PDF)

April 12.

Students present final project proposals:

Daniel Hefner.
Literature review: The Subgraph Isomorphism problem.
(PDF)

Lijun Huang.
Literature review:
Quartetbased tree estimation methods.
(PDF)

Baixin Zhang.
Literature review: The multiple
travelling salesman problem.
(PDF)
 Daniel Yuan and Stanley Liu. Research: MSA Benchmarking.
(PDF)

April 14.

Students present final project proposals:

Lily Barghi and Ashok Arjunakani.
Research: Exploring impact of varying
algorithmic parameters within SATe.
(PDF)

Ellen Nie and Hang Yu.
Research: Evaluating
two MSA methods on protein datasets.
(PDF)
 Kodi Collins.
Research study:
Modifying the design of PASTA and UPP to improve
largescale coestimation of alignments and trees.
(PDF)

Daryl Drake.
Research:
Comparison of alignment methods.
(PDF)

Ryan Healey. Literature Survey: The impact of edit
distance on alignment and tree estimation accuracy.
(PDF)

April 17.

Students present final project proposals:
 Allyson Julian. Literature Survey: Comparing MSA methods.
(PDF)

Ezra WinterNelson. Literature Survey:
Methods for estimating extinction and diversification rates.
(PDF)

William Zhang. Research: Comparison of
phylogenies computed using different methods
on a set of chloroplast genomes.
(PDF)

Jesus Sandoval.
Literature review:
Survey of fast methods for largescale tree estimation.
(PDF)

April 19.

Students present final project proposals:

Qingyuan Liu.
Literature review: Nonhomogeneous Hidden Markov Models.
(PDF)

Mayank Kale.
(PDF)

Reading assignment:
 "Fast, scalable generation of highquality
protein multiple sequence alignments using Clustal Omega"
by Sievers et al., Molecular Systems Biology (2011) 7: 539,
doi:10.1038/msb.2011.75,
available online at
http://msb.embopress.org/content/7/1/539.
Make sure you have read the paper, the supplementary
materials, and the reviewer comments provided in
"Transparent Process".
 April 21.
 Introduction to computational phylogenetics for
linguistics
(PDF)
 April 24.
Presentation by Aayushee Jain
of the paper
about Clustal Omega (reading assignment for April 19).
All other students must
read the posted papers, and ask at least two questions 
at least one of which is about the supplementary materials 
during the class presentation.
(The questions are due as homework on April 19.)

April 26.
Further discussion about alignment methods.
(PDF)

April 28.
Metagenomics.
(PDF)

May 1.
Phylogenomics.
(PDF)

May 3.
Last day! Student presentations of final projects.