Deep learning for population genetic inference

BioRxiv: the Preprint Server for Biology
Sara Sheehan, Yun S. Song

Abstract

Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce here a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. In contrast to Approximate Bayesian Computation, another likelihood-free approach widely used in population genetics and other fields, deep learning does not require a distance function on summary statistics or a rejection step, and it is robust to the addition of uninformative statistics. To demonstrate that deep learning can be effectively employed to estimate population genetic parameters and learn informative features of data, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer bot...
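
The contrast drawn above between Approximate Bayesian Computation and a learned mapping from summary statistics to parameters can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' method or software: the Poisson simulator, the three summary statistics (one of them deliberately uninformative), the single-hidden-layer network, and all function names are hypothetical. A rejection-style ABC estimate is included only to show where the distance function and the acceptance step enter.

```python
import numpy as np

def simulate_stats(theta, rng, n_loci=200):
    """Toy simulator (hypothetical): per-locus Poisson counts with mean theta."""
    counts = rng.poisson(theta, size=n_loci)
    # Two informative statistics plus one pure-noise (uninformative) statistic.
    return np.array([counts.mean(), counts.var(), rng.normal()])

def make_dataset(n, rng):
    """Draw parameters from a uniform prior and simulate their statistics."""
    thetas = rng.uniform(0.5, 5.0, size=n)
    stats = np.stack([simulate_stats(t, rng) for t in thetas])
    return stats, thetas

def train_mlp(X, y, hidden=8, epochs=3000, lr=0.02, seed=0):
    """Fit a one-hidden-layer tanh network by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    x_mu, x_sd = X.mean(0), X.std(0)
    y_mu, y_sd = y.mean(), y.std()
    Xn, yn = (X - x_mu) / x_sd, (y - y_mu) / y_sd
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        H = np.tanh(Xn @ W1 + b1)                 # hidden activations
        pred = (H @ W2 + b2).ravel()
        g = (2.0 / len(yn)) * (pred - yn)[:, None]  # d(MSE)/d(pred)
        gW2, gb2 = H.T @ g, g.sum(0)
        gH = (g @ W2.T) * (1.0 - H ** 2)          # backprop through tanh
        gW1, gb1 = Xn.T @ gH, gH.sum(0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2

    def predict(X_new):
        H = np.tanh(((X_new - x_mu) / x_sd) @ W1 + b1)
        return (H @ W2 + b2).ravel() * y_sd + y_mu

    return predict

def abc_rejection(obs, sim_stats, sim_thetas, keep=0.01):
    """Rejection-style ABC: needs a distance (Euclidean here) and an acceptance rule."""
    dist = np.linalg.norm(sim_stats - obs, axis=1)
    n_keep = max(1, int(keep * len(dist)))
    accepted = np.argsort(dist)[:n_keep]          # keep the closest simulations
    return sim_thetas[accepted].mean()

def demo(seed=1):
    """Return (network RMSE, ABC RMSE) on held-out simulated data."""
    rng = np.random.default_rng(seed)
    X_train, y_train = make_dataset(2000, rng)
    X_test, y_test = make_dataset(200, rng)
    predict = train_mlp(X_train, y_train)
    nn_est = predict(X_test)
    abc_est = np.array([abc_rejection(x, X_train, y_train) for x in X_test])
    rmse = lambda est: float(np.sqrt(np.mean((est - y_test) ** 2)))
    return rmse(nn_est), rmse(abc_est)

if __name__ == "__main__":
    print(demo())
```

In this sketch the network is trained once on simulated (statistics, parameter) pairs and then applied directly to new data, whereas the ABC function must choose a distance and a tolerance, and its unscaled Euclidean distance treats the noise statistic like any other.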
