Nov 6, 2015

chopBAI: BAM index reduction solves I/O bottlenecks in the joint analysis of large sequencing cohorts

BioRxiv : the Preprint Server for Biology
Birte Kehr, Páll Melsted


Summary: Advances in sequencing capacity have led to the generation of unprecedented amounts of genomic data. The processing of this data frequently leads to I/O bottlenecks, e.g. when analyzing a small genomic region across a large number of samples. The largest I/O burden is, however, often imposed not by the amount of data needed for the analysis but by the index files that help retrieve this data. We have developed chopBAI, a program that can chop a BAM index (BAI) file into small pieces. The program outputs a list of BAI files, each indexing a specified genomic interval. The output files are much smaller in size but maintain compatibility with existing software tools. We show how preprocessing BAI files with chopBAI can lead to a reduction of I/O by more than 95% during the analysis of 10 Kbp genomic regions, eventually enabling the joint analysis of more than 10,000 individuals. Availability and Implementation: The software is implemented in C++, GPL licensed and available at Contact:
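
As a rough illustration of why a per-interval index can be so much smaller, the sketch below lists the bins of the standard BAI binning scheme (as defined in the SAM specification) that can contain alignments overlapping a given region. It is not chopBAI's implementation, which this summary does not describe; the function name `reg2bins` follows the specification's reference routine, and the example coordinates are arbitrary assumptions.

```python
from typing import List


def reg2bins(beg: int, end: int) -> List[int]:
    """Bins that may hold alignments overlapping the 0-based,
    half-open interval [beg, end), per the SAM specification's
    binning scheme (minimum bin size 16 kbp, 6 levels)."""
    end -= 1  # switch to an inclusive end coordinate
    bins = [0]  # bin 0 spans the whole reference sequence
    # (first bin number of the level, shift giving the bin width)
    for offset, shift in ((1, 26), (9, 23), (73, 20), (585, 17), (4681, 14)):
        bins.extend(range(offset + (beg >> shift), offset + (end >> shift) + 1))
    return bins


# Number of distinct bins available per reference sequence:
# 1 + 8 + 64 + 512 + 4096 + 32768
TOTAL_BINS = sum(8 ** level for level in range(6))

if __name__ == "__main__":
    # Hypothetical 10 kbp region, chosen only for illustration.
    region_bins = reg2bins(1_000_000, 1_010_000)
    print(region_bins)
    print(f"{len(region_bins)} of {TOTAL_BINS} possible bins are relevant")
```

Under these assumptions, a query for a 10 Kbp region only needs a handful of the bins a full BAI file may carry for each reference sequence, which is consistent with the large I/O savings reported above.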
