Nov 2, 2018

A machine-learning approach for accurate detection of copy-number variants from exome sequencing

BioRxiv : the Preprint Server for Biology
Vijay Kumar, Santhosh Girirajan

Abstract

Copy-number variants (CNVs) are a major cause of several genetic disorders, making their detection an essential component of genetic analysis pipelines. Current methods for detecting CNVs from exome sequencing data are limited by high false positive rates and low concordance due to the inherent biases of individual algorithms. To overcome these issues, calls generated by two or more algorithms are often intersected using Venn-diagram approaches to identify “high-confidence” CNVs. However, this approach is inadequate, as it misses potentially true calls that do not have consensus from multiple callers. Here, we present CN-Learn, a machine-learning framework (<https://github.com/girirajanlab/CN_Learn>) that integrates calls from multiple CNV detection algorithms and learns to accurately identify true CNVs using caller-specific and genomic features from a small subset of validated CNVs. Using CNVs predicted by four exome-based CNV callers (CANOES, CODEX, XHMM and CLAMMS) from 503 samples, we demonstrate that CN-Learn identifies true CNVs at higher precision (~90%) and recall (~85%) rates while maintaining robust performance even when trained with minimal data (~30 samples). CN-Learn recovers twice as many CNVs compared to individu...
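
The limitation of the Venn-diagram approach described in the abstract can be illustrated with a small sketch. This is not CN-Learn's implementation: the function names, the coordinates, and the 50% overlap and two-caller thresholds are all hypothetical, chosen only to show how a fixed consensus vote drops a validated call that a single caller found. The per-caller overlap fractions computed here are one plausible form of the "caller-specific features" that a learned classifier could take as input instead of a fixed vote.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def overlap_fraction(call: Interval, others: List[Interval]) -> float:
    """Fraction of `call` covered by the union of the intervals in `others`."""
    start, end = call
    covered = 0
    cursor = start  # everything left of cursor has already been counted
    for o_start, o_end in sorted(others):
        lo = max(o_start, cursor)
        hi = min(o_end, end)
        if hi > lo:
            covered += hi - lo
            cursor = hi
    return covered / (end - start)


def caller_features(call: Interval,
                    calls_by_caller: Dict[str, List[Interval]]) -> Dict[str, float]:
    """One overlap fraction per caller for a candidate CNV."""
    return {name: overlap_fraction(call, ivs)
            for name, ivs in calls_by_caller.items()}


def venn_consensus(features: Dict[str, float],
                   min_callers: int = 2, min_overlap: float = 0.5) -> bool:
    """Fixed-vote filter: keep a call only if enough callers support it.
    Both thresholds are illustrative assumptions."""
    return sum(f >= min_overlap for f in features.values()) >= min_callers


def precision_recall(predicted: List[bool], truth: List[bool]) -> Tuple[float, float]:
    tp = sum(p and t for p, t in zip(predicted, truth))
    fp = sum(p and not t for p, t in zip(predicted, truth))
    fn = sum(t and not p for p, t in zip(predicted, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up calls from the four callers named in the abstract.
calls_by_caller = {
    "CANOES": [(1000, 5000)],
    "CODEX": [(1200, 4800)],
    "XHMM": [(20000, 26000)],
    "CLAMMS": [],
}
candidates = [(1000, 5000), (20000, 26000)]
validated = [True, True]  # hypothetical validation labels

kept = [venn_consensus(caller_features(c, calls_by_caller)) for c in candidates]
precision, recall = precision_recall(kept, validated)
print(kept, precision, recall)
```

In this toy data the second candidate is validated but supported only by XHMM, so the consensus filter discards it and recall falls to one half even though precision stays perfect. A classifier trained on a subset of validated calls could, in principle, learn when a single caller's evidence is sufficient.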
