Apr 5, 2014

fastSTRUCTURE: variational inference of population structure in large SNP data sets

Genetics
Anil Raj, Jonathan K Pritchard

Abstract

Tools for estimating population structure from genetic data are now used in a wide variety of applications in population genetics. However, inferring population structure in large modern data sets imposes severe computational challenges. Here, we develop efficient algorithms for approximate inference of the model underlying the STRUCTURE program using a variational Bayesian framework. Variational methods pose the problem of computing relevant posterior distributions as an optimization problem, allowing us to build on recent advances in optimization theory to develop fast inference tools. In addition, we propose useful heuristic scores to identify the number of populations represented in a data set and a new hierarchical prior to detect weak population structure in the data. We test the variational algorithms on simulated data and illustrate using genotype data from the CEPH-Human Genome Diversity Panel. The variational algorithms are almost two orders of magnitude faster than STRUCTURE and achieve accuracies comparable to those of ADMIXTURE. Furthermore, our results show that the heuristic scores for choosing model complexity provide a reasonable range of values for the number of populations represented in the data, with minima...

Citations: 201


Mentioned in this Paper

Heuristics
Genetics, Population
Bayesian Prediction
Single Nucleotide Polymorphism
Genome, Human
Genotype Determination
