New methods for estimating very large multiple sequence alignments and trees with up to 1,000,000 sequences

Tandy Warnow
Department of Computer Science
The University of Illinois at Urbana-Champaign

Multiple sequence alignment is a basic bioinformatics task with many downstream applications, including phylogeny estimation. In this talk I will describe two methods we have developed to improve multiple sequence alignment: PASTA (the successor to SATé) and UPP. Both methods can align ultra-large datasets of up to 1,000,000 sequences quickly and with high accuracy.

PASTA and UPP are both joint work with my former students Siavash Mirarab and Nam-phuong Nguyen.