Uncovering missed indels by leveraging unmapped reads

Scientific Reports
Mohammad Shabbir Hasan, Liqing Zhang

Abstract

In current practice, Next Generation Sequencing (NGS) applications start with mapping/aligning short reads to the reference genome, with the aim of identifying genetic variants. Although existing alignment tools map short reads to the reference genome with high accuracy, a significant number of short reads remain unmapped and are often excluded from downstream analyses, thereby causing non-negligible information loss in the subsequent variant calling procedure. This paper describes Genesis-indel, a computational pipeline that explores the unmapped reads to identify novel indels that are missed in the original procedure. Genesis-indel is applied to the unmapped reads of 30 breast cancer patients from TCGA. Results show that the unmapped reads are conserved between the two subtypes of breast cancer investigated in this study and might contribute to the divergence between the subtypes. Genesis-indel identifies 72,997 novel high-quality indels not previously found, among which 16,141 have not been annotated in the widely used mutation database. Statistical analysis of these indels shows significant enrichment of indels residing in oncogenes and tumour suppressor genes. Functional annotation further revea...

Methods Mentioned

PCR

Software Mentioned

Trimmomatic
MAQ
Bowtie2
BWA
SAMtools
Platypus
Integrated Genome Viewer (IGV)
Genesis-indel
Ensembl
SOAP2
