Predicting protein function and other biomedical characteristics with heterogeneous ensembles

Methods : a Companion to Methods in Enzymology
Sean Whalen, Gaurav Pandey

Abstract

Prediction problems in biomedical sciences, including protein function prediction (PFP), are generally quite difficult. This is due in part to incomplete knowledge of the cellular phenomenon of interest, the appropriateness and data quality of the variables and measurements used for prediction, as well as a lack of consensus regarding the ideal predictor for specific problems. In such scenarios, a powerful approach to improving prediction performance is to construct heterogeneous ensemble predictors that combine the output of diverse individual predictors that capture complementary aspects of the problems and/or datasets. In this paper, we demonstrate the potential of such heterogeneous ensembles, derived from stacking and ensemble selection methods, for addressing PFP and other similar biomedical prediction problems. Deeper analysis of these results shows that the superior predictive ability of these methods, especially stacking, can be attributed to their attention to the following aspects of the ensemble learning process: (i) better balance of diversity and performance, (ii) more effective calibration of outputs and (iii) more robust incorporation of additional base predictors. Finally, to make the effective application of h...
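As an illustration of the ensemble selection idea named in the abstract, the following is a minimal sketch of greedy forward selection with replacement over a set of base predictors. It is not the authors' implementation: the predictor names (`svm`, `rf`, `nb`), the validation labels and scores, and the choice of AUC as the selection metric are all assumptions made for this toy example.

```python
def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble_selection(labels, base_scores, max_rounds=10):
    """Greedy forward selection with replacement.

    base_scores maps a (hypothetical) predictor name to its scores on a
    validation set. Each round adds the predictor whose inclusion gives
    the best AUC for the averaged scores; selection stops when no
    candidate improves on the current ensemble.
    """
    selected = []
    total = [0.0] * len(labels)  # running sum of selected predictors' scores
    best_auc = 0.0
    for _ in range(max_rounds):
        candidates = []
        for name, scores in base_scores.items():
            trial = [(t + s) / (len(selected) + 1)
                     for t, s in zip(total, scores)]
            candidates.append((auc(labels, trial), name))
        round_auc, round_name = max(candidates)
        if round_auc <= best_auc:
            break
        best_auc = round_auc
        selected.append(round_name)
        total = [t + s for t, s in zip(total, base_scores[round_name])]
    return selected, best_auc


# Toy validation data (invented for illustration only).
labels = [1, 1, 1, 0, 0, 0]
base_scores = {
    "svm": [0.9, 0.8, 0.3, 0.4, 0.2, 0.1],
    "rf":  [0.6, 0.4, 0.7, 0.5, 0.3, 0.2],
    "nb":  [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}
selected, best = ensemble_selection(labels, base_scores)
print(selected, best)
```

In this toy data each of the two informative predictors misranks a different example, so their average ranks every positive above every negative. That is the kind of complementarity a heterogeneous ensemble is meant to exploit.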
