Dec 6, 2017

Succinct De Bruijn Graph Construction for Massive Populations Through Space-Efficient Merging

BioRxiv : the Preprint Server for Biology
Martin D Muggli, Christina Boucher


Recently, there has been a significant amount of effort in developing space-efficient and succinct data structures for storing and building the traditional de Bruijn graph and its variants, including the colored de Bruijn graph. However, a problem not yet considered is how to merge succinct representations of the de Bruijn graph, a capability that is necessary for constructing the de Bruijn graph on very large datasets. We create VARIMERGE, which builds the colored de Bruijn graph on a very large dataset by partitioning the data into smaller subsets, building the colored de Bruijn graph for each subset using an FM-index-based representation, and merging these representations iteratively. This last step is an algorithmic challenge for which we present an algorithm in this paper. Lastly, we demonstrate the utility of VARIMERGE by showing a four-fold reduction in working space when constructing an 8,000-color dataset, and the construction of a population graph two orders of magnitude larger than those built by previously reported methods.
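
The partition, build, and merge pipeline described in the abstract can be illustrated with a minimal sketch. It is not the VARIMERGE algorithm: the paper merges succinct FM-index-based representations, whereas this sketch uses plain Python dictionaries and sets, which are not space-efficient. The function names, the `batch_size` parameter, and the choice to store each edge as a (k+1)-mer string are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set

# Assumption: an edge is stored as a (k+1)-mer string, read as an edge from
# its length-k prefix to its length-k suffix. Colors are integer sample ids.
ColoredGraph = Dict[str, Set[int]]


def build_colored_graph(samples: Dict[int, Iterable[str]], k: int) -> ColoredGraph:
    """Build a colored de Bruijn graph (edge -> colors) for one subset of samples."""
    graph: Dict[str, Set[int]] = defaultdict(set)
    for color, sequences in samples.items():
        for seq in sequences:
            # A sequence of length n contains n - k edges of length k + 1.
            for i in range(len(seq) - k):
                graph[seq[i:i + k + 1]].add(color)
    return dict(graph)


def merge_graphs(a: ColoredGraph, b: ColoredGraph) -> ColoredGraph:
    """Merge two colored graphs: union of edges, union of colors per shared edge."""
    merged = {edge: set(colors) for edge, colors in a.items()}
    for edge, colors in b.items():
        merged.setdefault(edge, set()).update(colors)
    return merged


def build_by_partition(
    samples: Dict[int, Iterable[str]], k: int, batch_size: int
) -> ColoredGraph:
    """Partition the samples into batches, build each batch, and merge iteratively."""
    colors = sorted(samples)
    merged: ColoredGraph = {}
    for start in range(0, len(colors), batch_size):
        batch = {c: samples[c] for c in colors[start:start + batch_size]}
        merged = merge_graphs(merged, build_colored_graph(batch, k))
    return merged
```

In this toy setting, only one batch and the running merged graph are held in memory at a time, which mirrors the motivation for merging, although the working-space figures reported in the abstract apply to the succinct representation and not to this sketch.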


