Abstract

Molecular phylogenetics has long neglected polymorphisms within present and ancestral populations. Recently, multispecies coalescent-based methods have increased in popularity; however, their application is limited to a small number of species and individuals. We previously introduced a polymorphism-aware phylogenetic model (PoMo), which overcomes this limitation and scales well with the increasing amount of sequence data while accounting for present and ancestral polymorphisms. PoMo circumvents the handling of gene trees and infers species trees directly from allele frequency data. Here, we extend the PoMo implementation in IQ-TREE and integrate a search for the statistically best-fit mutation model, the ability to infer mutation rate variation across sites, and the assessment of branch support values. We demonstrate that PoMo can perform inference on large data sets with an analysis of a hundred species with ten haploid individuals each. PoMo is more accurate than standard substitution models applied to concatenated alignments, yet almost as fast. We also provide bmm-simulate, a software package that allows simulation of sequences evolving under PoMo. The new options consolidate the value of PoMo for phylogenetic analyses with population data.
