Abstract

We present a method that jointly analyzes polymorphic and divergent sites in the genomic sequences of multiple species to identify genes under natural selection and to assign the timing of selection to a specific lineage of the species phylogeny. The method integrates population genetics models within a Bayesian Poisson random field framework and combines information across all gene loci to boost the power to detect selection. It provides posterior distributions of the fitness effects of each gene, along with parameters associated with the evolutionary history, including the species divergence time and the effective population size of external species. Simulations demonstrate that our method achieves high power to identify genes under positive selection across a wide range of selection intensities and provides reasonably accurate estimates of the population genetic parameters. We apply the method to genomic sequences of humans, chimpanzees, gorillas, and orangutans and identify a list of lineage-specific targets of positive selection. The positively selected genes in the human lineage are enriched in pathways such as gene expression regulation, the immune system, and metabolism. Our analysis provides insights into natural evolution in the coding regions of humans and great apes and thus serves as a basis for further molecular and functional studies.
