Decomposing the site frequency spectrum: the impact of tree topology on neutrality tests

BioRxiv: the Preprint Server for Biology
L Ferretti, T Wiehe

Abstract

We investigate the dependence of the site frequency spectrum (SFS) on the topological structure of genealogical trees. We show that basic population genetic statistics -- for instance estimators of θ or neutrality tests such as Tajima's D -- can be decomposed into components of waiting times between coalescent events and of tree topology. Our results clarify the relative impact of the two components on these statistics. We provide a rigorous interpretation of positive or negative values of neutrality tests in terms of the underlying tree shape. In particular, we show that values of Tajima's D and Fay and Wu's H depend in a direct way on a measure of tree balance which is mostly determined by the root balance of the tree. We also compute the maximum and minimum values for neutrality tests as a function of sample size. Focusing on the standard coalescent model of neutral evolution, we discuss how waiting times between coalescent events are related to derived allele frequencies and thereby to the frequency spectrum. Finally, we show how tree balance affects the frequency spectrum. In particular, we derive the complete SFS conditioned on the root imbalance. We show that the conditional spectrum is peaked at frequencies correspondin...
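For readers unfamiliar with the statistics named in the abstract, the following is a minimal sketch of how the standard estimators of θ, Tajima's D, and the unnormalized Fay and Wu's H are computed from an unfolded SFS. It uses the textbook definitions only and does not reproduce the decomposition into waiting times and tree topology derived in the paper. The function name `sfs_statistics` and the layout of its return value are assumptions made for illustration.

```python
from math import comb, sqrt


def sfs_statistics(sfs):
    """Summary statistics from an unfolded SFS (hypothetical helper).

    sfs[i - 1] is the number of segregating sites whose derived allele
    appears in i of the n sampled sequences, for i = 1 .. n - 1.
    """
    n = len(sfs) + 1                      # sample size
    S = sum(sfs)                          # number of segregating sites
    pairs = comb(n, 2)                    # number of sequence pairs

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))

    # Watterson's estimator, pairwise diversity, and Fay and Wu's theta_H
    theta_w = S / a1
    theta_pi = sum(i * (n - i) * x for i, x in enumerate(sfs, start=1)) / pairs
    theta_h = sum(i * i * x for i, x in enumerate(sfs, start=1)) / pairs

    # Tajima's (1989) variance constants
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    var = e1 * S + e2 * S * (S - 1)

    # D is undefined when the variance term vanishes (e.g. no segregating sites)
    tajima_d = (theta_pi - theta_w) / sqrt(var) if var > 0 else float("nan")

    return {
        "theta_w": theta_w,
        "theta_pi": theta_pi,
        "theta_h": theta_h,
        "tajima_d": tajima_d,
        "fay_wu_h": theta_pi - theta_h,   # unnormalized H
    }


# Expected neutral spectrum xi_i = theta / i with theta = 12 and n = 5
neutral = sfs_statistics([12, 6, 4, 3])
# An excess of singletons and an excess of high-frequency derived alleles
singletons = sfs_statistics([10, 0, 0, 0])
high_freq = sfs_statistics([0, 0, 0, 10])
```

When the input is the expected neutral spectrum, all three estimators coincide with θ, so D and H are zero. An excess of singletons pushes D below zero, and an excess of high-frequency derived alleles pushes H below zero, which is the kind of sign behaviour the abstract relates to tree shape.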

