Evaluating the Evaluation of Cancer Driver Genes

BioRxiv : the Preprint Server for Biology
Collin TokheimRachel Karchin


Sequencing has identified millions of somatic mutations in human cancers, but distinguishing cancer driver genes remains a major challenge. Numerous methods have been developed to identify driver genes, but evaluation of the performance of these methods is hindered by the lack of a gold standard, i.e., bona fide driver gene mutations. Here, we establish an evaluation framework that can be applied when a gold standard is not available. We used this framework to compare the performance of eight driver gene prediction methods. One of these methods, newly described here, incorporated a machine learning-based ratiometric approach. We show that the driver genes predicted by each of these eight methods vary widely. Moreover, the p-values reported by several of the methods were inconsistent with the uniform values expected, thus calling into question the assumptions that were used to generate them. Finally, we evaluated the potential effects of unexplained variability in mutation rates on false positive driver gene predictions. Our analysis points to the strengths and weaknesses of each of the currently available methods and offers guidance for improving them in the future.

Related Concepts

Somatic Mutation
Nucleic Acid Sequencing
Gene Mutation
Malignant Neoplasms

Related Feeds

BioRxiv & MedRxiv Preprints

BioRxiv and MedRxiv are the preprint servers for biology and health sciences respectively, operated by Cold Spring Harbor Laboratory. Here are the latest preprint articles (which are not peer-reviewed) from BioRxiv and MedRxiv.

Cancer Genomics (Preprints)

Cancer genomics employ high-throughput technologies to identify the complete catalog of somatic alterations that characterize the genome, transcriptome and epigenome of cohorts of tumor samples. Discover the latest preprints here.