npInv: accurate detection and genotyping of inversions using long read sub-alignment

BMC Bioinformatics
Haojing Shao, Lachlan J M Coin

Abstract

Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with non-repetitive breakpoints, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored. We present npInv, a novel tool designed specifically to detect and genotype NAHR inversions using sub-alignments of long sequencing reads. We benchmark npInv against other tools on both simulated and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR-mediated inversions in NA12878 with flanking IRs shorter than 7 kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near-linear relationship between the length of the flanking IR and the minimum inversion size, without inverted repeats. npInv shows high accuracy on both simulated and real data, and the results offer deeper insight into inversions.
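
As a rough illustration of the sub-alignment idea mentioned in the abstract, the sketch below flags a long read whose consecutive aligned segments map to the same chromosome on opposite strands, which is the generic split-read signature of an inversion breakpoint. This is not npInv's implementation: the `SubAlignment` record, its field names, and the `strand_switches` helper are hypothetical, and a real caller would also need to handle inverted repeats, mapping quality, and genotyping.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SubAlignment:
    """One aligned segment of a long read (hypothetical record layout)."""
    read_start: int   # segment start on the read
    read_end: int     # segment end on the read
    chrom: str        # reference sequence name
    ref_start: int    # segment start on the reference
    ref_end: int      # segment end on the reference
    strand: str       # "+" or "-"


def strand_switches(
    subs: List[SubAlignment],
) -> List[Tuple[SubAlignment, SubAlignment]]:
    """Return consecutive segment pairs (ordered along the read) that map
    to the same chromosome but on opposite strands."""
    ordered = sorted(subs, key=lambda s: s.read_start)
    return [
        (a, b)
        for a, b in zip(ordered, ordered[1:])
        if a.chrom == b.chrom and a.strand != b.strand
    ]


# Toy read (made-up coordinates): forward, reverse, forward segments.
read = [
    SubAlignment(0, 4000, "chr1", 100_000, 104_000, "+"),
    SubAlignment(4000, 9000, "chr1", 110_000, 115_000, "-"),
    SubAlignment(9000, 12000, "chr1", 116_000, 119_000, "+"),
]

for left, right in strand_switches(read):
    print(left.chrom, left.ref_end, right.ref_start, left.strand, right.strand)
```

In this toy read, two strand switches are reported, one at each end of the reverse-strand segment, which is what a read spanning an entire inversion would be expected to show.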

