Phasing of single DNA molecules by massively parallel barcoding

Nature Communications
Erik Borgström ... Afshin Ahmadian

Abstract

High-throughput sequencing platforms mainly produce short-read data, resulting in a loss of phasing information for many of the genetic variants analysed. For certain applications, it is vital to know which variant alleles are connected to each individual DNA molecule. Here we demonstrate a method for massively parallel barcoding and phasing of single DNA molecules. First, a primer library with millions of uniquely barcoded beads is generated. When compartmentalized with single DNA molecules, the beads can be used to amplify and tag any target sequences of interest, enabling coupling of the biological information from multiple loci. We apply the assay to bacterial 16S sequencing and up to 94% of the hypothesized phasing events are shown to originate from single molecules. The method enables use of widely available short-read-sequencing platforms to study long single molecules within a complex sample, without losing phase information.
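
The coupling step described above can be illustrated with a minimal sketch: reads that carry the same bead barcode are assumed to come from the same compartment, so their alleles at different loci are grouped together. This is not the authors' pipeline. The record layout `(barcode, locus, allele)`, the locus names, the barcode sequences, and both function names are hypothetical choices made only for illustration.

```python
from collections import defaultdict


def phase_by_barcode(reads):
    """Group alleles by barcode, then by locus.

    `reads` is an iterable of (barcode, locus, allele) tuples; this
    layout is an assumption for the sketch, not a real file format.
    """
    molecules = defaultdict(dict)
    for barcode, locus, allele in reads:
        # Keep every allele seen for this barcode at this locus.
        molecules[barcode].setdefault(locus, set()).add(allele)
    return dict(molecules)


def is_single_molecule(loci):
    """True if each locus shows exactly one allele.

    More than one allele at a locus suggests that the barcode was
    shared by several template molecules.
    """
    return all(len(alleles) == 1 for alleles in loci.values())


# Hypothetical example reads: three barcodes, two loci.
reads = [
    ("ACGT", "V1", "A"),
    ("ACGT", "V6", "G"),
    ("TTAG", "V1", "C"),
    ("TTAG", "V6", "G"),
    ("GGCA", "V1", "A"),
    ("GGCA", "V1", "C"),
]

phased = phase_by_barcode(reads)
consistent = {bc: is_single_molecule(loci) for bc, loci in phased.items()}
```

In this toy input, barcodes `ACGT` and `TTAG` each link one allele per locus, while `GGCA` shows two alleles at the same locus and would be flagged as not originating from a single molecule.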

Datasets Mentioned

SRA248941

Methods Mentioned

genotyping
PCR
single-cell sequencing
454 Sequencing
fluorescence-activated cell sorting (FACS)
electrophoresis
