CS598/BioE598: Algorithmic Genomic Biology (2016)

Instructor: Tandy Warnow, Founder Professor of Engineering

This is the description of the 2016 CS598/BioE598 AGB course. The 2015 course description is located at 598-2015.html.

Tentative Course Schedule

Office hours: Mondays 12-1 and Tuesdays 4-5, Siebel Center 3235. After April 27, office hours only by appointment.

Course location: TR 11:00 AM - 12:20 PM, 1109 Siebel Center for Computer Science.

Course description: The purpose of the course is to give you enough background and training in the area of algorithmic genomic biology that you will be able to do research in this area and publish papers. Every year, two or more students from this course have done final projects that were subsequently published in major scientific journals; you can be one of them! The main focus of the course is on phylogeny (evolutionary tree) estimation, multiple sequence alignment, and genome-scale phylogenetics, which are problems that present very interesting challenges from a computational and statistical standpoint. Time permitting, we will also discuss computational problems in microbiome analysis, protein function and structure prediction, genome assembly, and even historical linguistics. Students will learn the mathematical and computational foundations in these areas, read the current literature, and do a team research project. The course is designed for doctoral students in computer science, computer engineering, bioengineering, mathematics, and statistics, and does not depend on any prior background in biology. The technical material will depend on discrete algorithms, graph theory, simulations, and probabilistic analysis of algorithms.

Pre-requisites: No biology background is required, but students should have some mathematical maturity, and at least one undergraduate course in algorithm design, data structures, or probability theory. The course will be very mathematical, and being able to understand proofs and design methods with provable guarantees is the main point of this course. If you do not have the equivalent of an undergraduate degree in math, statistics, computer science, or some other engineering program, the course may not be appropriate for you. However, if you are a PhD student doing research in an area that relies strongly on phylogeny estimation, then please see me to discuss whether this course is appropriate for you.

Course materials: The first half of the course is based on a textbook I am writing, Computational Phylogenetics. This is a draft, so please let me know if you find any typos or other mistakes! The second half of the course will be based on papers in the recent scientific literature. You are expected to do all assigned reading (whether from the textbook or from published papers) before coming to class.

Grading:

Homeworks: Homeworks are due in class by 11:10 AM; homeworks handed in on time can receive full credit. Homeworks handed in late, but less than 24 hours after the deadline, receive 80% of the grade. Homeworks handed in between 24 and 48 hours after the deadline receive 60% of the grade. No homeworks will be accepted more than 48 hours after the deadline. The single worst homework grade will be dropped.
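
As a rough illustration of the late policy above, here is a minimal Python sketch of how the credit for a homework could be computed. The function names (late_multiplier, drop_worst) are hypothetical and not part of any course software. The treatment of a homework handed in exactly 24 or exactly 48 hours late is an assumption (it is placed in the stricter tier), since the policy does not say which tier those boundary cases fall into.

```python
from datetime import timedelta


def late_multiplier(lateness):
    """Return the fraction of credit for a homework handed in `lateness`
    after the deadline, or None if it would no longer be accepted.

    Assumption: exact 24-hour and 48-hour boundaries fall into the
    stricter tier; the written policy leaves these cases open.
    """
    if lateness <= timedelta(0):
        return 1.0   # on time: eligible for full credit
    if lateness < timedelta(hours=24):
        return 0.8   # late by less than 24 hours
    if lateness < timedelta(hours=48):
        return 0.6   # late by 24 hours up to (not including) 48 hours
    return None      # 48 hours or more: not accepted


def drop_worst(scores):
    """Return the scores with a single lowest score removed."""
    remaining = list(scores)
    if remaining:
        remaining.remove(min(remaining))
    return remaining
```

For example, under these assumptions a homework handed in 30 hours late would be graded and then multiplied by late_multiplier(timedelta(hours=30)), and drop_worst would be applied once to the full list of homework grades at the end of the term.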

Midterm Exam: This will be a take-home exam, and will be due back in class on March 29, 2016.

Final Project: Each student must complete a final project, which is due May 5, 2016, in hardcopy and by email (you must give the hardcopy to me directly - in class or in my office hours). You are strongly encouraged to do a research project, but you can also do a survey paper on some topic relevant to the course material. In both cases, your project should be a paper (of about 15 pages) in a format and style appropriate for submission to a journal. Research projects can involve two students, but survey papers must be done by yourself. Grades on the final project depend upon the kind of project you do. For a research paper, your grade will be 30% writing, 40% scientific/algorithmic rigor, and 30% impact. If you do a survey paper, the grade will be 30% writing, 30% summary of the literature you discuss, and 40% commentary (i.e., insight and critical, thoughtful discussion of the issues that come up). See this page for a list of possible final projects. Also, here is a list of external software that may be helpful to you in your projects, in case you want to do phylogenetic analyses of datasets and compare results.

Class Presentation: The presentation of scientific papers is a major part of the course, and all students are expected to participate actively in discussing these papers.

Course Participation: Your course participation will be evaluated in terms of how you participate in the in-class discussions of the scientific literature we are reading, and also of the presentations of scientific papers given by the other students.

Academic integrity: You are expected to abide by the university academic integrity standards, which means (among other things) that you should never copy anyone else's homework nor let anyone copy your homework. This is particularly important for your final project, especially if you refer to the scientific literature in your project. You must also never plagiarize, which means (among other things) that any text that you copy from another document must be properly attributed (with quotation marks around the copied material, and a citation to the document from which you have copied the material). Even paraphrasing can count as plagiarism. All violations of academic integrity standards will be reported to the appropriate university offices. Serious violations will result in a failing grade for the course. Please see this page for a brief discussion of this issue, and the real academic integrity page. The academic integrity code is applied to the homework assignments as follows. You are encouraged to work with other students on the homework, but if you do, follow these rules. First, indicate on the homework who you worked with. Second, do not look at anyone else's homework solutions when you write your own solutions; this includes not looking at someone else's write-up of a critique of some literature. Third, and more generally, you must write your homework solutions entirely on your own, using your own language. Please do not under any circumstances copy homework solutions from anyone else, or let anyone copy from you. Similarly, the academic integrity code is applied to your final project by the expectation that you will not copy text from any paper, and that you will give appropriate credit to all material that you use from prior publications, websites, etc.