IIBR Informatics: Advancing Bioinformatics Methods using Ensembles of Profile Hidden Markov Models

Participants:

Funding: U.S. National Science Foundation grant DBI-2006069 (ABI Innovation), $500,000.

Project Overview: Profile Hidden Markov Models (profile HMMs) are probabilistic graphical models that are widely used in bioinformatics. Research over the last decade has shown that ensembles of profile HMMs (e-HMMs) can provide greater accuracy than a single profile HMM for many applications in bioinformatics, including phylogenetic placement, multiple sequence alignment, and taxonomic identification of metagenomic reads. Although these improvements have been substantial, the design of these e-HMMs has been fairly ad hoc, and their use can be computationally intensive, which reduces their appeal in practice. This project advances the use of e-HMMs in two ways: it develops statistically rigorous techniques for building e-HMMs, with the goal of improving both their accuracy and our understanding of them, and it develops methods that apply e-HMMs to different bioinformatics problems. Broader impacts include software schools, engagement with under-represented groups, and open-source software.

Journal publications supported by this grant:

Preprints supported by this grant (not otherwise published):

Project Software:

Symposia and Software Schools: The grant will provide symposia and software schools to train researchers (from students through faculty) in new methods. We will hold a Phylogenomics Software School as part of the Joint Congress on Evolutionary Biology in Athens, Georgia, on June 20, 2025.

Presentations: See http://tandy.cs.illinois.edu/talks.html for the full list of talks.

Course materials: My course CS 581: Algorithmic Genomic Biology covers this material, as well as related topics. The lectures are available for download, mostly in PDF format.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.