Spring 2017 graduate course, CS 581 and BIOE 540: Algorithmic Computational Genomics

Instructor: Tandy Warnow, Founder Professor of Engineering

Course meets Tuesdays and Thursdays from 11:00 to 12:20

Office hours: Tu 12:30-1:30 PM in Siebel Center 3235

Teaching Assistant: Pranjal Vachaspati, SC 4105, Office Hours Friday 2-4.

Course schedule (tentative)

Homework assignments

Course description: The purpose of the course is to give you enough background and training in the area of algorithmic genomic biology that you will be able to do research in this area and publish papers. Every year, two or more students from this course have done final projects that were subsequently published in major scientific journals; you can be one of them! For examples of these papers, see Mirarab et al., Bioinformatics 2014, Zimmermann et al., BMC Genomics 2014, Davidson et al., BMC Genomics 2015, Chou et al., BMC Genomics 2015, Vachaspati and Warnow, BMC Genomics 2015, and Nute and Warnow, BMC Genomics 2016. The main focus of the course is on phylogeny (evolutionary tree) estimation, multiple sequence alignment, and genome-scale phylogenetics, which are problems that present very interesting challenges from a computational and statistical standpoint. However, we also cover genome assembly and annotation, computational problems in microbiome analysis, and protein function and structure prediction. Students will learn the mathematical and computational foundations in these areas, read the current literature, and do a team research project. The course is designed for doctoral students in computer science, computer engineering, bioengineering, mathematics, and statistics, and does not depend on any prior background in biology. The technical material draws on discrete algorithms, graph theory, simulations, and probabilistic analysis of algorithms. Please see the 598AGB webpage for the 2016 course, which has substantial overlap with this year's course.

Pre-requisites: CS 374 and CS 361/STAT 361, or consent of the instructor. No biology background is required. If you did not take these pre-requisites at UIUC, you will need to get permission from me to stay in the course. This will include doing an extra homework assignment (due January 21) and meeting with me one-on-one after the homework is submitted to review the material. Permission to remain in the course will depend on how well you have mastered the pre-requisite material. You may wish to consider taking CS 466 instead of this course (see http://tandy.cs.illinois.edu/CS466.html).

Course Textbook: The textbook is Computational Phylogenetics by Tandy Warnow, to be published in Spring 2017 by Cambridge University Press. Please let me know if you find any typos, since I can still make corrections.

Other course materials: Approximately the first half of the course covers phylogenomics and multiple sequence alignment, and is based on the textbook. The second half of the course will cover genome assembly and annotation, comparative genomics, and metagenomics, and will be based on the scientific literature. You are expected to do all assigned reading (whether from the textbook or from published papers) before coming to class.

Grading:

Homeworks: Homeworks need to be submitted to MOODLE in PDF format; these are due at 1 PM on the due date, which will generally be Tuesdays. Homeworks due on or before April 4 can be submitted up to 48 hours past the deadline for reduced credit (80% if within 24 hours and 60% if within 48 hours); homeworks due after April 4 must be submitted by the deadline for credit. The single worst homework grade will be dropped.
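The late-credit rule and the dropped grade can be worked through with a short Python sketch. This is only an illustration, not an official grade calculator: the function names are hypothetical, and it assumes that "April 4" means April 4, 2017, that the reduced credit is a multiplier on the score you earn, and that the remaining homework grades are combined by a simple average (the page does not say how they are combined).

```python
from datetime import date, datetime, timedelta

# Assumption: "April 4" refers to April 4, 2017 (the Spring 2017 term).
LATE_POLICY_LAST_DUE_DATE = date(2017, 4, 4)


def credit_multiplier(due: datetime, submitted: datetime) -> float:
    """Fraction of the earned score that counts for a given submission time."""
    if submitted <= due:
        return 1.0  # on time: full credit
    if due.date() > LATE_POLICY_LAST_DUE_DATE:
        return 0.0  # homeworks due after April 4 earn no late credit
    lateness = submitted - due
    if lateness <= timedelta(hours=24):
        return 0.8  # up to 24 hours late
    if lateness <= timedelta(hours=48):
        return 0.6  # more than 24 and up to 48 hours late
    return 0.0  # more than 48 hours late


def homework_average(scores: list[float]) -> float:
    """Average of the homework scores after dropping the single worst one.

    Assumption: a plain average; the actual combination rule is not stated.
    """
    if not scores:
        raise ValueError("no homework scores")
    kept = sorted(scores)[1:] if len(scores) > 1 else scores
    return sum(kept) / len(kept)


# Hypothetical homework due on a Tuesday at 1 PM, submitted 30 hours late.
due = datetime(2017, 3, 7, 13, 0)
print(credit_multiplier(due, due + timedelta(hours=30)))  # 0.6
print(homework_average([90, 50, 100]))  # 95.0
```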

Final Project: Each student must complete a final project, which is due in class on the last day the class meets. Please provide a hard copy to me directly, either in class or during my office hours. You are strongly encouraged to do a research project, but you can also do a survey paper on some topic relevant to the course material. In both cases, your project should be a paper (of about 15 pages) in a format and style appropriate for submission to a journal. Research projects can involve two students, but survey papers must be done by yourself. Grades on the final project depend upon the kind of project you do. For a research paper, your grade will be 30% writing, 40% scientific/algorithmic rigor, and 30% impact. If you do a survey paper, the grade will be 30% writing, 30% summary of the literature you discuss, and 40% commentary (i.e., insight and critical, thoughtful discussion of the issues that come up). See this page for a list of possible final projects provided for this course in a previous year. Also, here is a list of external software that may be helpful to you in your projects, in case you want to do phylogenetic analyses of datasets and compare results.
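As a rough illustration of how the final project weights combine, here is a minimal Python sketch. The component names, the 0-100 scale for each component, and the function `project_grade` are assumptions made for the example; only the percentage weights come from the description above.

```python
# Weights in percent, taken from the final project description.
# The dictionary keys are hypothetical labels for the graded components.
RESEARCH_WEIGHTS = {"writing": 30, "rigor": 40, "impact": 30}
SURVEY_WEIGHTS = {"writing": 30, "summary": 30, "commentary": 40}


def project_grade(kind: str, scores: dict[str, float]) -> float:
    """Weighted project grade, assuming each component is scored from 0 to 100."""
    if kind == "research":
        weights = RESEARCH_WEIGHTS
    elif kind == "survey":
        weights = SURVEY_WEIGHTS
    else:
        raise ValueError(f"unknown project kind: {kind!r}")
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights) / 100


# Hypothetical component scores for a research paper.
print(project_grade("research", {"writing": 80, "rigor": 90, "impact": 70}))  # 81.0
```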

Class Presentation: All students will present research papers from the recent scientific literature. The presentation of scientific papers is a major part of the course, and all students are expected to participate actively in discussing these papers.

Course Participation: Your course participation will be evaluated based on how you participate in the in-class discussions of the scientific literature we are reading and of the scientific papers presented by the other students.

Academic integrity: You are expected to abide by the university academic integrity standards, which means (among other things) that you should never copy anyone else's homework nor let anyone copy your homework. This is particularly important for your final project, especially if you refer to the scientific literature in your project. You must also never plagiarize, which means (among other things) that any text that you copy from another document must be properly attributed (with quotation marks around the copied material, and a citation to the document from which you have copied the material). Even paraphrasing can count as plagiarism. All violations of academic integrity standards will be reported to the appropriate university offices. Serious violations will result in a failing grade for the course. Please see this page for a brief discussion of this issue, and the real academic integrity page. The academic integrity code is applied to the homework assignments as follows. You are encouraged to work with other students on the homework, but if you do, follow these rules. First, indicate on the homework who you worked with. Second, do not look at other students' homework solutions when you write your own solutions; this includes not looking at someone else's write-up of a critique of some literature. Third, and more generally, you must write your homework solutions entirely on your own, using your own language. Please do not under any circumstances copy homework solutions from anyone else, or let anyone copy from you. Similarly, the academic integrity code is applied to your final project by the expectation that you will not copy text from any paper, and that you will give appropriate credit to all material that you use from prior publications, websites, etc. It is particularly important to think about ethics in the context of your research. I have written up some scenarios challenging research integrity, but also see the interesting article in WIRED on The Young Billionaire Behind the War on Bad Science and the Retraction Watch website.

Emergency response recommendations: Please see this webpage.

Additional reading: