ACM Fellow (2016): For contributions to mathematical theory, algorithms, and software for large-scale molecular phylogenetics and historical linguistics.

News: My former student Siavash Mirarab received an honorable mention for the ACM Doctoral Dissertation Award, 2016 (UCSD announcement) (ACM announcement).

Postdoctoral Position Available: I have an opening in my group for a postdoc to work on developing computational methods for large-scale multiple sequence alignment, phylogeny estimation, or metagenomics. Prior publications on these problems, strong programming skills, and an interest in collaboration are necessary. If you are interested in applying, please contact me.

Phylogenomics Symposium and Software School, June 16-17, 2016, in Austin, Texas (part of Evolution 2016)

Research Overview: My research combines mathematics, computer science, probability, and statistics to develop algorithms with improved accuracy for large-scale and complex estimation problems in phylogenomics (genome-scale phylogeny estimation), multiple sequence alignment, and metagenomics. I work especially on the hardest computational problems in these areas, where large dataset sizes and model complexity make existing approaches insufficiently accurate. For these problems, I develop innovative strategies (often including graph-theoretic algorithms that employ divide-and-conquer, combined with powerful statistical methods), and prove theorems about the methods we develop. I also work in Historical Linguistics, which seeks to estimate how language families (e.g., Indo-European) evolved. We use real data and perform massive simulations to evaluate the performance of the methods that we develop, and we also collaborate closely with biologists and linguists in data analysis. Our current collaborations include the 1KP (Thousand Transcriptome Project) and the Avian Phylogenomics Project. These collaborations include data analysis and the development of new methods for estimating alignments and trees (both gene trees and species trees). We welcome collaborations with biologists who have data that are difficult to analyze, either because the datasets are too large for current methods, or because current methods are not sufficiently accurate. As an example of my work, please see my SMBE 2015 talk, where I discussed the new methods we are developing for estimating species trees from genome-scale data; these methods have high accuracy on very large datasets (1000 species and 1000 genes), even in the presence of massive gene tree discord.

News Articles

Spring 2016 graduate course, 598 Algorithmic Computational Genomics: Tuesdays and Thursdays, 11:00 AM - 12:20 PM, 1109 Siebel Center for Computer Science. The purpose of the course is to give each student enough background and training in the area of algorithmic genomic biology to do research in this area and publish papers. Every year, two or more students from this course have done final projects that were subsequently published in major scientific journals; you can be one of them! The main focus of the course is on phylogeny (evolutionary tree) estimation, but the course also covers the related problems of computing multiple sequence alignments, analyzing metagenomes, and even historical linguistics. Students will learn the mathematical and computational foundations in these areas, read the current literature, and do a team research project. The course is designed for doctoral students in computer science, computer engineering, bioengineering, mathematics, and statistics, and does not depend on any prior background in biology. Please see the 598AGB webpage for more information.

If you are interested in joining my lab, please read this, and then contact me.

Current Research and NSF Funding: My research is currently funded by four grants from the National Science Foundation.

I also recently benefited from the support of the John Simon Guggenheim Memorial Foundation, and from earlier support from the David and Lucile Packard Foundation, the Radcliffe Institute for Advanced Study at Harvard University, the Program for Evolutionary Dynamics at Harvard University, and Microsoft Research, New England. The Founder Professorship is funded through the Grainger Engineering Breakthroughs Initiative, which is supporting the development of research in Big Data and Bioengineering at UIUC. I am grateful to the National Science Foundation for its continuous support since 1994. See this page for my funding since 2001.


"Plus de détails, plus de détails, disait-il à son fils, il n'y a d'originalité et de vérité que dans les détails..." -- Stendhal, Lucien Leuwen (a quote much loved by my stepfather, Martin J. Klein, and an essential guide for all scholarship).

Click here for Google Scholar Citations (i10-index 120 and h-index 50).




For prospective students and postdocs

Current and former students and postdocs

Prior courses: CS 173(b) and CS 598AGB

Recent Symposia and Software Schools 

CIPRES

Personal

Conference Calendar





Downloadable papers

Complete vita and publication list

Brief vita

Software

Former lab website 

How to write your first paper

Seminar Talks (2015-present)

Contact info