Modeling SNP array ascertainment with Approximate Bayesian Computation for demographic inference

Scientific Reports
Consuelo D Quinto-Cortés, Michael F Hammer


Single nucleotide polymorphisms (SNPs) in commercial arrays have often been discovered in a small number of samples from selected populations. This ascertainment skews patterns of nucleotide diversity and affects population genetic inferences. We propose a demographic inference pipeline that explicitly models the SNP discovery protocol in an Approximate Bayesian Computation (ABC) framework. We simulated genomic regions according to a demographic model incorporating parameters for the divergence of three well-characterized HapMap populations and recreated the SNP distribution of a commercial array by varying the number of haploid samples and the allele frequency cut-off in the given regions. We then calculated summary statistics obtained from both the ascertained and genomic data and inferred ascertainment and demographic parameters. We implemented our pipeline to study the admixture process that gave rise to the present-day Mexican population. Our estimate of the time of admixture is closer to the historical dates than those in previous works which did not consider ascertainment bias. Although the use of whole genome sequences for demographic inference is becoming the norm, there are still underrepresented areas of the world fr...
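The ascertainment step described in the abstract (a discovery panel of a few haploid samples plus an allele frequency cut-off) and its inference by ABC can be illustrated with a minimal sketch. This is not the authors' pipeline: the paper simulates genomic regions under a demographic model, whereas this toy version draws population allele frequencies from a log-uniform distribution and uses plain rejection ABC with a single summary statistic. All function names, priors, and parameter ranges below are hypothetical choices made for illustration only.

```python
import math
import random


def draw_frequencies(n_snps, rng):
    """Toy population allele frequencies, log-uniform on [0.001, 0.999].

    Assumption: this stands in for a real coalescent simulation and simply
    produces a spectrum dominated by rare alleles.
    """
    lo, hi = math.log(0.001), math.log(0.999)
    return [math.exp(rng.uniform(lo, hi)) for _ in range(n_snps)]


def ascertain(freqs, panel_size, min_freq, rng):
    """Keep SNPs that are polymorphic in a discovery panel of `panel_size`
    haploid samples and whose panel minor allele frequency is >= `min_freq`."""
    kept = []
    for p in freqs:
        # Sample the discovery panel: each haploid carries the allele with prob. p.
        count = sum(rng.random() < p for _ in range(panel_size))
        panel_maf = min(count, panel_size - count) / panel_size
        if panel_maf > 0 and panel_maf >= min_freq:
            kept.append(p)
    return kept


def mean_maf(freqs):
    """Summary statistic: mean minor allele frequency (0.0 for an empty set)."""
    if not freqs:
        return 0.0
    return sum(min(p, 1 - p) for p in freqs) / len(freqs)


def abc_rejection(observed_stat, n_sims, n_keep, rng, n_snps=500):
    """Rejection ABC over (panel_size, min_freq).

    Draws parameters from hypothetical uniform priors, simulates ascertained
    data, and keeps the `n_keep` draws whose summary statistic is closest to
    the observed one.
    """
    sims = []
    for _ in range(n_sims):
        panel_size = rng.randint(2, 20)    # prior on discovery panel size
        min_freq = rng.uniform(0.0, 0.3)   # prior on allele frequency cut-off
        kept = ascertain(draw_frequencies(n_snps, rng), panel_size, min_freq, rng)
        sims.append((abs(mean_maf(kept) - observed_stat), panel_size, min_freq))
    sims.sort(key=lambda s: s[0])
    return [(panel, cutoff) for _, panel, cutoff in sims[:n_keep]]


if __name__ == "__main__":
    rng = random.Random(1)
    # Pretend "array" data: ascertained with a 4-haploid panel and a 0.25 cut-off.
    array_like = ascertain(draw_frequencies(2000, rng), 4, 0.25, rng)
    posterior = abc_rejection(mean_maf(array_like), n_sims=300, n_keep=15, rng=rng)
    print(posterior)
```

In this sketch the ascertained set is enriched for common alleles relative to the full set, which is the skew in diversity patterns that the abstract refers to. A realistic analysis would need several summary statistics and joint priors over the demographic parameters as well.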



