Ranking metrics in gene set enrichment analysis: do they matter?

BMC Bioinformatics
Joanna Zyla, Joanna Polanska

Abstract

Many methods exist for describing the complex relation between changes of gene expression in molecular pathways or gene ontologies under different experimental conditions. Among them, Gene Set Enrichment Analysis seems to be one of the most commonly used (over 10,000 citations). An important parameter that could affect the final result is the choice of metric for ranking genes. Applying a default ranking metric may lead to poor results. In this work, 28 benchmark data sets were used to evaluate the sensitivity and false positive rate of gene set analysis for 16 different ranking metrics, including new proposals. Furthermore, the robustness of the chosen methods to sample size was tested. Using the k-means clustering algorithm, a group of four metrics with the highest performance in terms of overall sensitivity, overall false positive rate and computational load was established: the absolute value of the Moderated Welch Test statistic, the Minimum Significant Difference, the absolute value of the Signal-To-Noise ratio and the Baumgartner-Weiss-Schindler test statistic. In the case of false positive rate estimation, all selected ranking metrics were robust with respect to sample size. In the case of sensitivity, the absolute value of Moderated We...
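To make the idea of a ranking metric concrete, the following is a minimal sketch of one of the four metrics named in the abstract, the absolute value of the Signal-To-Noise ratio, assuming the commonly used definition |mean_a - mean_b| / (sd_a + sd_b) for a two-group comparison. The function names, the input layout (a dict of per-gene expression values plus a 0/1 label per sample) and the toy data are assumptions made for illustration, not the authors' implementation. Practical implementations may also adjust very small standard deviations; this sketch does not.

```python
from statistics import mean, stdev


def abs_signal_to_noise(group_a, group_b):
    """Return |mean_a - mean_b| / (sd_a + sd_b) for one gene.

    Assumed definition of the absolute Signal-To-Noise ratio; each group
    needs at least two samples so a sample standard deviation exists.
    """
    spread = stdev(group_a) + stdev(group_b)
    if spread == 0:
        raise ValueError("both groups have zero variance; metric undefined")
    return abs(mean(group_a) - mean(group_b)) / spread


def rank_genes(expression, labels):
    """Order genes from highest to lowest |S2N|.

    expression: dict mapping gene name -> list of expression values
    labels:     list of 0/1 group labels, one per sample (hypothetical layout)
    """
    scores = {}
    for gene, values in expression.items():
        group_a = [v for v, lab in zip(values, labels) if lab == 0]
        group_b = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = abs_signal_to_noise(group_a, group_b)
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: three control samples (0) followed by three case samples (1).
toy_labels = [0, 0, 0, 1, 1, 1]
toy_expression = {
    "GENE_A": [1, 2, 3, 7, 8, 9],  # large shift between groups
    "GENE_B": [5, 6, 7, 5, 6, 7],  # no shift
    "GENE_C": [9, 8, 7, 5, 4, 3],  # moderate shift, opposite direction
}
ranking = rank_genes(toy_expression, toy_labels)
```

Because the absolute value is taken, up- and down-regulated genes both move toward the top of the ranked list, which is what lets an enrichment method detect gene sets with changes in either direction.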



Datasets Mentioned

GSE11024
GSE1145
GSE4183

Methods Mentioned

FCS

Software Mentioned

cudaGSEA
EnrichmentBrowser
Java
DAVID
Parallel Computing Toolbox
CePa
GSEA4GWAS
SNP
RuleGO
ompGSEA
