by Stilianos Louca, Florent Mazel, Michael Doebeli, Laura Wegener Parfrey
The global diversity of Bacteria and Archaea, the most ancient and most widespread forms of life on Earth, is a subject of intense controversy. This controversy stems largely from the fact that existing estimates are based entirely on theoretical models or extrapolations from small and biased data sets. Here, in an attempt to census the bulk of Earth’s bacterial and archaeal (“prokaryotic”) clades and to estimate their overall global richness, we analyzed over 1.7 billion 16S ribosomal RNA amplicon sequences in the V4 hypervariable region obtained from 492 studies worldwide, covering a multitude of environments and using multiple alternative primers. From this data set, we recovered 739,880 prokaryotic operational taxonomic units (OTUs, 16S-V4 gene clusters at 97% similarity), a commonly used measure of microbial richness. Using several statistical approaches, we estimate that there exist globally about 0.8–1.6 million prokaryotic OTUs, of which we recovered between 47% and 96%, representing >99.98% of prokaryotic cells. Consistent with this conclusion, our data set independently “recaptured” 91%–93% of 16S sequences from multiple previous global surveys, including PCR-independent metagenomic surveys. The distribution of relative OTU abundances is consistent with a log-normal model commonly observed in larger organisms; the total number of OTUs predicted by this model is also consistent with our global richness estimates. By combining our estimates with the ratio of full-length versus partial-length (V4) sequence diversity in the SILVA sequence database, we further estimate that there exist about 2.2–4.3 million full-length OTUs worldwide. When restricting our analysis to the Americas, while controlling for the number of studies, we obtain richness estimates similar to those for the global data set, suggesting that most OTUs are globally distributed. Qualitatively similar results are also obtained for other 16S similarity thresholds (90%, 95%, and 99%).
Our estimates constrain the extent of a poorly quantified rare microbial biosphere and refute recent predictions that there exist trillions of prokaryotic OTUs.