CS 581 (Fall 2023): Algorithmic Computational Genomics

Instructor: Tandy Warnow, Grainger Distinguished Chair in Engineering

Time: TuTh 11 AM to 12:15 PM, 2018 Campus Instructional Facility. On occasion, lectures will be given by zoom (link will be sent to registered students, and posted in Moodle).

Office hours: Before October 17, Tuesdays, 2-3, by zoom (link will be emailed to registered students, and posted in Moodle). Starting on October 17, office hours will be by appointment only.


Teaching Assistant: Yasamin Tabatabaee. Office hours: Fridays 11-12 by zoom. The zoom link is posted in the Moodle course page.

Lectures: These will be posted at least 24 hours in advance. You are expected to read these before coming to class.

Homework: These include reading assignments as well as problem sets. You are expected to do all the reading by the due date.

Course description: This is a course on applied algorithms, focusing on the use of discrete mathematics, graph theory, probability theory, statistics, machine learning, and simulations to design and analyze algorithms for phylogeny (evolutionary tree) estimation, multiple sequence alignment, and genome-scale phylogenetics, with extra topics based on student interest (e.g., genome assembly and annotation, and metagenomics). See the detailed syllabus for a fuller description of the course material. Each of these biological problems is important and unsolved, so new methods are needed. Every year, at least one student in the course has done a project that was subsequently published in scientific conferences and journals (see this page); you can be one of these students!

Course project: The course requires a final project of each student. You are strongly encouraged to do a research project, but you can also do a survey paper on some topic relevant to the course material. In both cases, your project should be a paper of at least 3000 words (not including the bibliography) in a format and style appropriate for submission to a journal such as Bioinformatics (however, please provide this in a single-column format, not double-column). Research projects can involve two students, but survey papers must be done by yourself. (Note: When projects are done by two students, the division of the work should be communicated in the write-up, and each student should submit their own course project write-up. Please see me to discuss requirements regarding division of work and write-ups for your specific project, if this applies to you.) Grades on the final project depend upon the kind of project you do. For a research paper, your grade will be 30% writing and 70% content. If you do a survey paper, the grade will be 40% writing and 60% content. In both cases, you should include a thoughtful discussion of the relevant literature and have an appropriate bibliography. Note also the requirements for reproducibility (for research papers) and the expectations about writing quality; see this PDF for some writing advice. See also this page for suggested topics. I meet with each group several times during the semester to help them make progress, and the TA is also available to help. To see some of the papers that have resulted from these course projects, see this page. Finally, note that there is a required in-person presentation of the course project proposal. You will need to schedule this presentation with me; these presentations will not take place during the class lecture.

Pre-requisites: CS 374 and CS 361/STAT 361, or consent of the instructor; no biology background is required. As most of you did not do your undergraduate degrees at UIUC, you would not have taken these courses here. I am not concerned if you are a graduate student in the CS program, since this would imply you have this background anyway. But if you are a graduate student in another program, you will need to meet with me to discuss your background. The first homework, to some extent, will be used to evaluate your readiness for the course in terms of your background training. I will grade this myself and then meet with you if your performance on the homework does not reflect sufficient background. In that case, we should discuss options, including switching to Credit/No Credit, dropping the course but auditing, etc.

COVID-19 precautions: If you are ill, or have been exposed to COVID-19, or have recently tested positive for COVID-19, do not come to class or in-person office hours.

Who should take this class: The course is designed for graduate students in CS, ECE, Math, Statistics, and ISE; students from other programs may not have the mathematics training that is needed. However, no background in biology is required.

Undergraduate students: If you are an advanced undergraduate student and interested in taking the course, please email me to discuss your qualifications. I generally do not let undergraduate students into the class because this is a research-focused advanced course requiring many different skills (including theorem proving, implementation, analysis of algorithms, scientific literature reviews, etc.). However, if you are sufficiently advanced (preferably a senior with substantial coursework already completed that demonstrates this range of skills), serious about the commitment necessary to do this course, and planning to apply for PhD programs, then I may allow you into the class.

Assigned reading: The assigned reading will include papers from the scientific literature, as well as the required textbook Computational Phylogenetics: An introduction to designing methods for phylogeny estimation, published by Cambridge University Press. Nearly all of the textbook will be covered during the class, and most of the homework will be taken from the textbook, so you do need to get this textbook. Please check the campus bookstore for availability. Students have also obtained the book from Amazon or Cambridge University Press. An e-book is also available from Google Play.


Midterm: The midterm will be handed out on October 5 and will be due on October 8 by 10 PM in Moodle. The midterm is take-home, and you are expected to do it entirely by yourself. If you have questions, please ask the TA or the instructor, rather than consulting others. The midterm review will be held Tuesday, October 3, from 2-4 PM (Tandy covers 2-3 and Yasamin covers 3-4), but also see this review document.

Guidance on writing assignments. Many of the activities in this course involve writing, and the grade for these assignments depends in part on the quality of the writing. This is specifically true for the final project. It's very important that you familiarize yourself with expectations about scholarly writing, and in particular with how to avoid plagiarizing. Please see the information in the Academic Integrity page and specifically note the instructions about plagiarism and how improper paraphrasing can count as plagiarism. In addition, please see my write-up with guidelines for reviewing computational papers.


Absence policy: Absences are allowed, but the student is required to learn what was covered when they were not attending the lecture (via zoom). Note that the lectures are not recorded! For this reason, course presentations are provided on the course webpage.

Additional Syllabus statements