Dot2dot: accurate whole-genome tandem repeats discovery

Bioinformatics
Loredana M Genovese, Filippo Geraci

Abstract

Large-scale sequencing projects have confirmed the hypothesis that eukaryotic DNA is rich in repetitions whose functional role needs to be elucidated. In particular, tandem repeats (TRs), i.e. short, almost identical sequences that lie adjacent to each other, have been associated with many cellular processes and are also involved in several genetic disorders. The need for comprehensive lists of TRs for association studies and the absence of a computational model able to capture their variability have revived research on discovery algorithms. Building upon the idea that sequence similarities can be easily displayed using graphical methods, we formalized the structure that TRs induce in dot-plot matrices where a sequence is compared with itself. Leveraging the observation that a compact representation of these matrices can be built and searched in linear time, we developed Dot2dot: an accurate algorithm fast enough to be suitable for whole-genome discovery of TRs. Experiments on five manually curated collections of TRs have shown that Dot2dot is more accurate than other established methods, and completes the analysis of the biggest known reference genome in about one day on a standard PC. Source code and datasets are fre...
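To make the dot-plot idea concrete: when a sequence is compared with itself, a tandem repeat of period p appears as a run of matches on the diagonal lying p cells away from the main diagonal. The Python sketch below illustrates only that observation. It is not the Dot2dot algorithm: it handles exact repeats only, the period is supplied by the caller, the full dot-plot it builds is quadratic in the sequence length (the abstract describes a compact, linear-time representation instead), and all function names are hypothetical.

```python
def self_dot_plot(seq):
    """Return the upper-triangle cells (i, j), i < j, where seq[i] == seq[j].

    Naive O(n^2) construction, for illustration only; this is NOT the
    compact representation mentioned in the abstract.
    """
    n = len(seq)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if seq[i] == seq[j]}


def longest_diagonal_run(seq, period):
    """Find the longest run of matches on the diagonal j = i + period.

    Returns (start, length). A run of length L starting at i means
    seq[i:i+L] == seq[i+period:i+period+L].
    """
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i in range(len(seq) - period):
        if seq[i] == seq[i + period]:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def naive_exact_tandem_repeat(seq, period, min_copies=2):
    """Report the longest exact tandem repeat with the given period.

    Hypothetical helper: returns (start, end, copies) with `end` exclusive,
    or None if fewer than `min_copies` whole copies are found.
    """
    start, run = longest_diagonal_run(seq, period)
    # A diagonal run of length `run` covers run + period bases in total.
    copies = (run + period) // period
    if copies < min_copies:
        return None
    return start, start + run + period, copies


if __name__ == "__main__":
    seq = "GGACACACACTT"
    print(naive_exact_tandem_repeat(seq, period=2))  # (2, 10, 4): "AC" x 4
```

Scanning a single diagonal takes time linear in the sequence length, which is why the example avoids materializing the matrix when searching. Tolerating the mismatches and indels found in real TRs would require a scoring scheme that this sketch does not attempt.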

Methods Mentioned

PCR

Software Mentioned

fastq
Tandem Repeat Finder
tandemSWAN
TRA
TRF
Repeat Masker
SciRoKo
mreps
fasta
reachest
