Mar 11, 2016

Exploratory analysis and error modeling of a sequencing technology

BioRxiv : the Preprint Server for Biology
Michael Inouye, Taane G Clark

Abstract

Next-generation DNA sequencing methods have created an unprecedented leap in sequence data generation; novel computational tools and statistical models are therefore required to optimize and assess the resulting data. In this report, we explore underlying causes of error for the Illumina Genome Analyzer (IGA) sequencing technology and attempt to quantify their effects using a human bacterial artificial chromosome sequenced to 60,000-fold coverage. Seven potential error predictors are considered: Phred score, read entropy, tile coordinates, local tile density, base position within read, nucleotide call, and lane. With these parameters, logistic regression and log-linear models are constructed and used to show that each of the potential predictors contributes to error (P < 1×10⁻⁴). With this additional information, we apply the logistic model and achieve a 3% improvement in both the sensitivity and specificity of IGA error detection. Further, we demonstrate that these modeling approaches can be used as a feedback loop to inform laboratory methods and identify specific machine or run bias.
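
As a rough illustration of how some of the predictors named in the abstract could enter a logistic model, the following minimal sketch computes two of them (the error probability implied by a Phred score, and the Shannon entropy of a read) and passes them through a logistic function. The function names, the choice of features, and all coefficients are hypothetical and chosen only for illustration; they are not taken from the paper, which fits its models to real sequencing data.

```python
import math
from collections import Counter


def phred_to_error_prob(q):
    """Standard Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


def read_entropy(read):
    """Shannon entropy (in bits) of the nucleotide composition of a read."""
    counts = Counter(read)
    n = len(read)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def logistic_error_prob(features, coefs, intercept):
    """Logistic model: P(error) = 1 / (1 + exp(-(b0 + sum_i b_i * x_i)))."""
    z = intercept + sum(coefs[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients for illustration only (NOT estimates from the paper).
coefs = {"phred": -0.2, "entropy": -0.5, "position": 0.03}

# One base call described by three of the seven predictors.
features = {
    "phred": 30,                            # reported quality score
    "entropy": read_entropy("ACGTACGTAC"),  # complexity of the read
    "position": 20,                         # base position within the read
}

p_error = logistic_error_prob(features, coefs, intercept=0.0)
print(f"Phred-implied error probability: {phred_to_error_prob(30):.4f}")
print(f"Model-based error probability:   {p_error:.6f}")
```

In a real analysis the coefficients would be estimated from base calls labeled as correct or erroneous against a known reference, and the remaining predictors (tile coordinates, local tile density, nucleotide call, and lane) would be included as additional terms.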

Mentioned in this Paper

Computer Software
Positioning Attribute
Laboratory Procedures
Logistic Regression
Bacterial Artificial Chromosomes
Sequencing
Massively-Parallel Sequencing
Genomic DNA
Sequence Determinations, DNA
Nucleotides
