The Multispecies Network Coalescent and Phylogenetic Network Inference

Luay Nakhleh
Department of Computer Science
Rice University

The multispecies coalescent (MSC) model has emerged as the main stochastic process that helps capture the intricate relationship between species trees and gene trees. Combined with models of sequence evolution, the MSC can be viewed as a generative model of genomic sequence data in the context of a (species) phylogenetic tree. In particular, the MSC naturally explains and allows for quantifying the phenomenon of incomplete lineage sorting (ILS).
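
As a small illustration of what quantifying ILS means, consider the standard three-taxon case. The sketch below assumes a species tree ((A,B),C) with one sampled lineage per species and an internal branch of length t in coalescent units; the function name and the dictionary layout are hypothetical and chosen only for this example. Under the MSC, the A and B lineages fail to coalesce in the internal branch with probability exp(-t), and when that happens each of the three gene tree topologies is equally likely.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities under the MSC for species tree ((A,B),C).

    t is the internal branch length in coalescent units (assumed >= 0).
    The function name and return layout are illustrative assumptions.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Probability that the A and B lineages do NOT coalesce in the internal branch.
    p_no_coalescence = math.exp(-t)
    # With no coalescence, all three lineages enter the root population and
    # each of the three topologies is equally likely.
    discordant = p_no_coalescence / 3.0
    concordant = 1.0 - 2.0 * discordant
    return {
        "((A,B),C)": concordant,  # matches the species tree
        "((A,C),B)": discordant,  # discordant due to ILS
        "((B,C),A)": discordant,  # discordant due to ILS
    }


if __name__ == "__main__":
    for t in (0.1, 1.0, 3.0):
        print(t, three_taxon_gene_tree_probs(t))
```

Shorter internal branches push the three probabilities toward 1/3 each, which is why ILS is most pronounced after rapid successive speciation events.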

The use of genome-wide data in evolutionary analysis has yielded increasing evidence for reticulation during the evolution of various groups of eukaryotic species. Reticulation has long been acknowledged as a major evolutionary process in prokaryotes, but not in eukaryotes. Reticulate evolutionary histories are best represented as phylogenetic networks, which extend the tree model to allow for admixtures of genetic material.

In this talk, I will describe the multispecies network coalescent (MSNC) model, which extends the MSC model so that it operates within the branches of a phylogenetic network. This extended model naturally allows for modeling vertical and horizontal evolutionary processes acting within and across species boundaries. In particular, it simultaneously accounts for gene tree incongruence across loci due to both hybridization and incomplete lineage sorting. I will then describe a likelihood function for this model, as well as a method for Bayesian sampling of phylogenetic networks and their parameters using reversible-jump Markov chain Monte Carlo (RJMCMC). All the methods I describe have been implemented in our open-source software package, PhyloNet, which is publicly available at