Lighter: fast and memory-efficient error correction without counting

BioRxiv: the Preprint Server for Biology
Li Song, Ben Langmead


Lighter is a fast and memory-efficient tool for correcting sequencing errors in high-throughput sequencing datasets. Lighter avoids counting k-mers in the sequencing reads. Instead, it uses a pair of Bloom filters: one populated with a sample of the input k-mers, and the other populated with k-mers that are likely to be correct based on a simple test. As long as the sampling fraction is adjusted in inverse proportion to the depth of sequencing, the Bloom filter size can be held constant while maintaining near-constant accuracy. Lighter is easily applied to very large sequencing datasets. It is parallelized, uses no secondary storage, and is both faster and more memory-efficient than competing approaches while achieving comparable accuracy. Lighter is free, open-source software available from <>.
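The two-filter idea in the abstract can be illustrated with a minimal Python sketch. This is not Lighter's implementation: the `BloomFilter` class, the function names, the `scale` constant, and in particular the `min_fraction` position test are hypothetical stand-ins chosen for illustration, since the abstract says only that the second filter holds k-mers "likely to be correct based on a simple test".

```python
import hashlib
import random


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative, not optimized)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            salt = i.to_bytes(8, "little")
            digest = hashlib.blake2b(item.encode(), salt=salt).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


def kmers(read, k):
    """All overlapping substrings of length k, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def sampling_fraction(depth, scale=7.0):
    """Sampling fraction inversely proportional to sequencing depth.

    `scale` is an arbitrary illustrative constant (assumption).
    """
    return min(1.0, scale / depth)


def trusted_kmers(read, k, sampled, min_fraction):
    """Stand-in test (assumption): a position is trusted when at least
    min_fraction of the k-mers covering it are in the sampled filter;
    a k-mer passes when all of its positions are trusted."""
    kms = kmers(read, k)
    hit = [km in sampled for km in kms]
    ok = []
    for j in range(len(read)):
        lo, hi = max(0, j - k + 1), min(j, len(kms) - 1)
        covering = hit[lo:hi + 1]
        ok.append(bool(covering)
                  and sum(covering) >= min_fraction * len(covering))
    return [km for i, km in enumerate(kms) if all(ok[i:i + k])]


def build_filters(reads, k, alpha, min_fraction=0.5,
                  num_bits=1 << 16, num_hashes=4, seed=0):
    """Two passes over the reads, with no k-mer counting."""
    rng = random.Random(seed)
    sampled = BloomFilter(num_bits, num_hashes)
    trusted = BloomFilter(num_bits, num_hashes)
    # Pass 1: keep each k-mer occurrence with probability alpha.
    for read in reads:
        for km in kmers(read, k):
            if rng.random() < alpha:
                sampled.add(km)
    # Pass 2: store k-mers that pass the stand-in test.
    for read in reads:
        for km in trusted_kmers(read, k, sampled, min_fraction):
            trusted.add(km)
    return sampled, trusted
```

Because `sampling_fraction` falls as depth rises, the expected number of distinct k-mers entering the first filter stays roughly level, which is why a fixed `num_bits` can serve datasets of different depths. A k-mer that occurs many times is also far more likely to survive sampling than one that occurs once, so the sampled filter tends to favor k-mers that are not sequencing errors.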
