CS 581 (Fall 2021): Algorithmic Computational Genomics

Instructor: Tandy Warnow, Grainger Distinguished Chair in Engineering

Time: TuTh 9:30 AM to 10:45 AM, by Zoom (the link will be sent to registered students). Please email the instructor for the registration link.

Office hours: In person on Wednesdays, 9-10 AM, in 2400C (Terrace), or by Zoom, depending on the weather. All other office hours are by appointment.

Teaching Assistant: Eleanor Wedell. Office hours: Thursday 4-5 PM (Zoom) and Friday 11 AM to 12 noon, 2nd floor, 2400C (Terrace).

Lectures

Homework

Course description: This is a course on applied algorithms, focusing on the use of discrete mathematics, graph theory, probability theory, statistics, machine learning, and simulations to design and analyze algorithms for phylogeny (evolutionary tree) estimation, multiple sequence alignment, and genome-scale phylogenetics, with extra topics based on student interest (e.g., genome assembly and annotation, and metagenomics). See the detailed syllabus for a fuller description of the course material. Each of these biological problems is important and unsolved, so new methods are needed. Every year, at least one student in the course has done a project that was subsequently published in scientific conferences and journals (see this page); you can be one of these students!

Course project: Each student must complete a final project, which is due on the last day the class meets. You are strongly encouraged to do a research project, but you can also write a survey paper on a topic relevant to the course material. In both cases, your project should be a paper of at least 3000 words (not including the bibliography) in a format and style appropriate for submission to a journal such as Bioinformatics (however, please provide it in single-column format, not double-column). Research projects can involve two students, but survey papers must be done individually. (Note: When a project is done by two students, the division of the work should be communicated in the write-up, and each student should submit their own course project write-up. Please see me to discuss requirements regarding division of work and write-ups for your specific project, if this applies to you.)

Grades on the final project depend upon the kind of project you do. For a research paper, your grade will be 30% writing and 70% content. For a survey paper, your grade will be 40% writing and 60% content. In both cases, you should include a thoughtful discussion of the relevant literature and have an appropriate bibliography. Note also the requirements for reproducibility (for research papers) and the expectations about writing quality; see this PDF for some writing advice.

See also this page for suggested topics. I meet with each group several times during the semester to help them make progress, and the TA (Eleanor Wedell) is also available to help. To see some of the papers that have resulted from these course projects, see this page.

Prerequisites: CS 374 and CS 361/STAT 361, or consent of the instructor. No biology background is required.

COVID-19 precautions: In an abundance of caution, I have decided to hold the course lectures online, and will provide the Zoom link to registered students. However, I will schedule small-group "office hours" in person and hold these outdoors while the weather permits. These in-person office hours will require appointments and will be limited to 2-3 students at a time, so that adequate distance can be maintained between everyone. I will also hold office hours by Zoom each week (you do not need to schedule these). If you are ill, have been exposed to COVID-19, or have recently tested positive for COVID-19, do not come to in-person office hours. Moreover, please keep several feet away from me, the TA, and other students during the in-person office hours. Please also see this page for additional specific information about my COVID-19 policies.

Who should take this class: The course is designed for graduate students in CS, ECE, Math, and Statistics; no background in biology is required.

Undergraduate students: If you are an advanced undergraduate student (in CS, ECE, Mathematics, Physics, or Statistics) and interested in taking the course, please email me to discuss your qualifications. I generally do not admit undergraduate students to the class because this is a research-focused advanced course requiring many different skills (including theorem proving, implementation, analysis of algorithms, and scientific literature reviews). However, if you are sufficiently advanced (preferably a senior with substantial completed coursework that demonstrates this range of skills), serious about the commitment this course requires, and planning to apply to PhD programs, then I may allow you into the class.

Assigned reading: The assigned reading will include papers from the scientific literature, as well as the required textbook Computational Phylogenetics: An introduction to designing methods for phylogeny estimation, published by Cambridge University Press. Nearly all of the textbook will be covered during the class, and most of the homework will be taken from the textbook, so you do need to get it. Please check the campus bookstore for availability. Students have also obtained the book from Amazon or Cambridge University Press. An e-book is also available from Google Play.

Grading

Midterm: Your midterm is due in Moodle by 10 PM on October 10. If any part of it is handwritten, you need to submit the hard copy on Monday by 11 AM to Candice Steidinger in Siebel 3.3240.

Guidance on writing assignments. Many of the activities in this course involve writing, and the grade for these assignments depends in part on the quality of the writing. This is especially true for the final project. It is very important that you familiarize yourself with the expectations for scholarly writing, and in particular with how to avoid plagiarism. Please see the information on the Academic Integrity page, and specifically note the instructions about plagiarism and how improper paraphrasing can count as plagiarism. In addition, please see my write-up with guidelines for reviewing computational papers.

Expectations

Additional syllabus statements: The College of Engineering has recommended several extra statements, which I agree with and hence include here.