May 27, 2014

Lighter: fast and memory-efficient error correction without counting

BioRxiv : the Preprint Server for Biology
Li Song, Ben Langmead

Abstract

Lighter is a fast and memory-efficient tool for correcting sequencing errors in high-throughput sequencing datasets. Lighter avoids counting k-mers in the sequencing reads. Instead, it uses a pair of Bloom filters, one populated with a sample of the input k-mers and the other populated with k-mers likely to be correct based on a simple test. As long as the sampling fraction is adjusted in inverse proportion to the depth of sequencing, the Bloom filter size can be held constant while maintaining near-constant accuracy. Lighter is easily applied to very large sequencing datasets. It is parallelized, uses no secondary storage, and is both faster and more memory-efficient than competing approaches while achieving comparable accuracy. Lighter is free open source software available from <https://github.com/mourisl/Lighter/>.
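
The sampling idea in the abstract can be illustrated with a minimal Python sketch. It is not Lighter's implementation: the `BloomFilter` class, the helper names, and the tuning parameter `constant` are all hypothetical, chosen only to show how a sampling fraction set in inverse proportion to depth keeps the expected number of k-mers inserted into the first filter roughly constant. The second filter and its "simple test" are not specified in the abstract, so they are left out.

```python
import hashlib
import random


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative only)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            salt = i.to_bytes(8, "little")
            digest = hashlib.blake2b(item.encode(), salt=salt).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(read, k):
    """Yield every length-k substring of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


def sampling_fraction(depth, constant):
    """Assumed rule: fraction = constant / depth, capped at 1.

    `constant` is a hypothetical tuning parameter, not a value from the paper.
    """
    return min(1.0, constant / depth)


def sample_kmers(reads, k, alpha, bloom, rng):
    """Insert each k-mer into the filter with probability alpha.

    Returns the number of insertions performed; nothing is counted per k-mer.
    """
    inserted = 0
    for read in reads:
        for kmer in kmers(read, k):
            if rng.random() < alpha:
                bloom.add(kmer)
                inserted += 1
    return inserted
```

Under this assumed rule, doubling the sequencing depth halves the sampling fraction, so the expected load on the first filter stays about the same and its size need not grow with the dataset.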
