Alignment-free detection of local similarity among viral and bacterial genomes

Mirjana Domazet-Lošo, Bernhard Haubold


Bacterial and viral genomes are often affected by horizontal gene transfer observable as abrupt switching in local homology. In addition to the resulting mosaic genome structure, they frequently contain regions not found in close relatives, which may play a role in virulence mechanisms. Due to this connection to medical microbiology, there are numerous methods available to detect horizontal gene transfer. However, these are usually aimed at individual genes and viral genomes rather than the much larger bacterial genomes. Here, we propose an efficient alignment-free approach to describe the mosaic structure of viral and bacterial genomes, including their unique regions. Our method is based on the lengths of exact matches between pairs of sequences. Long matches indicate close homology, short matches more distant homology or none at all. These exact match lengths can be looked up efficiently using an enhanced suffix array. Our program implementing this approach, alfy (ALignment-Free local homologY), efficiently and accurately detects the recombination break points in simulated DNA sequences and among recombinant HIV-1 strains. We also apply alfy to Escherichia coli genomes where we detect new evidence for the hypothesis that stra...
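The match-length idea in the abstract can be illustrated with a small Python sketch. This is not alfy's implementation: the abstract states that alfy looks up match lengths with an enhanced suffix array, whereas this sketch substitutes plain substring search for brevity. The function names match_lengths and closest_subject, and the toy sequences in the usage note, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


def match_lengths(query: str, subject: str) -> List[int]:
    """For each query position i, return the length of the longest
    prefix of query[i:] that occurs somewhere in subject.

    Assumption: naive substring search stands in for the enhanced
    suffix array mentioned in the abstract.
    """
    lengths = []
    k = 0
    for i in range(len(query)):
        # A match of length k at i-1 implies a match of length k-1 at i.
        k = max(k - 1, 0)
        while i + k < len(query) and query[i:i + k + 1] in subject:
            k += 1
        lengths.append(k)
    return lengths


def closest_subject(query: str, subjects: Dict[str, str]) -> List[Optional[str]]:
    """Label each query position with the subject giving the longest
    exact match there, or None if no subject matches at all."""
    per_subject = {name: match_lengths(query, seq) for name, seq in subjects.items()}
    labels: List[Optional[str]] = []
    for i in range(len(query)):
        best = max(per_subject, key=lambda name: per_subject[name][i])
        labels.append(best if per_subject[best][i] > 0 else None)
    return labels
```

For a toy mosaic query such as "AAAAACCCCC" compared against subjects "AAAAAGGGGG" and "TTTTTCCCCC", the per-position label switches from the first subject to the second at position 5, which is the kind of abrupt change in local homology the abstract describes. Positions labeled None would correspond to regions with no match in any subject.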



Related Concepts

Alkalescens-Dispar Group
Homologous Sequences, Nucleic Acid
Computer Programs and Programming
Viral Genome
Genome, Bacterial
Sequence Determinations, DNA
Recombination, Interspecies
Abruptio Placentae
