May 31, 2015

Improved Algorithms for Finding Edit Distance Based Motifs

BioRxiv : the Preprint Server for Biology
Soumitra Pal, Sanguthevar Rajasekaran

Abstract

Motif search is an important step in extracting meaningful patterns from biological data. Since the general problem of motif search is intractable, there is a pressing need to develop efficient exact and approximation algorithms to solve this problem. We design novel algorithms for solving the Edit-distance-based Motif Search (EMS) problem: given two integers l,d and n biological strings, find all strings of length l that appear in each input strings with at most d substitutions, insertions and deletions. These algorithms have been evaluated on several challenging instances. Our algorithm solves a moderately hard instance (11,3) in a couple of minutes and the next difficult instance (14,3) in a couple of hours whereas the best previously known algorithm, EMS1, solves (11,3) in a few hours and does not solve (13,4) even after 3 days. This significant improvement is due to a novel and provably efficient neighborhood generation technique introduced in this paper. This efficient approach can be used in other edit distance based applications in Bioinformatics, such as k-spectrum based sequence error correction algorithms. We also use a trie based data structure to efficiently store the candidate motifs in the neighbourhood and to ou...Continue Reading

  • References
  • Citations

References

  • We're still populating references for this paper, please check back later.
  • References
  • Citations

Citations

  • This paper may not have been cited yet.

Mentioned in this Paper

Patterns
Gene Deletion
Bio-Informatics
Protein K
CTTN
CTTN gene
Emergency Medical Service
Protein Domain
Structure
CTTN wt Allele

About this Paper

Related Feeds

BioRxiv & MedRxiv Preprints

BioRxiv and MedRxiv are the preprint servers for biology and health sciences respectively, operated by Cold Spring Harbor Laboratory. Here are the latest preprint articles (which are not peer-reviewed) from BioRxiv and MedRxiv.

Bioinformatics in Biomedicine (Preprints)

Bioinformatics in biomedicine incorporates computer science, biology, chemistry, medicine, mathematics and statistics. Discover the latest preprints on bioinformatics in biomedicine here.

Related Papers

Journal of Computational Biology : a Journal of Computational Molecular Cell Biology
Anas Al-Okaily, Chun-Hsi Huang
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Xingyu CaiSanguthevar Rajasekaran
© 2020 Meta ULC. All rights reserved