Cross-validation of component models: a critical look at current methods

Analytical and Bioanalytical Chemistry
R Bro, Henk A L Kiers

Abstract

In regression, cross-validation is an effective and popular approach that is used to decide, for example, the number of underlying features, and to estimate the average prediction error. The basic principle of cross-validation is to leave out part of the data, build a model, and then predict the left-out samples. While such an approach can also be envisioned for component models such as principal component analysis (PCA), most current implementations do not comply with the essential requirement that the predictions should be independent of the entity being predicted. Further, these methods have not been properly reviewed in the literature. In this paper, we review the most commonly used generic PCA cross-validation schemes and assess how well they work in various scenarios.
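The basic principle described in the abstract (leave out part of the data, build a model, predict the left-out samples) can be sketched for the regression case as follows. This is a minimal illustration only, not one of the PCA cross-validation schemes the paper reviews. The toy data and the function names `fit_line`, `fit_mean`, and `loo_cv_mse` are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def fit_mean(xs, ys):
    """Intercept-only model: predicts the training mean; returns (a, 0)."""
    return sum(ys) / len(ys), 0.0


def loo_cv_mse(xs, ys, fit):
    """Leave-one-out cross-validated mean squared prediction error."""
    squared_errors = []
    for i in range(len(xs)):
        # Leave out sample i, so the model never sees the value it predicts.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit(train_x, train_y)
        # Predict the left-out sample from the model built without it.
        prediction = a + b * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical toy data, roughly y = 2x with small perturbations.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 1.9, 4.2, 5.8, 8.1, 9.9]

mse_mean = loo_cv_mse(xs, ys, fit_mean)
mse_line = loo_cv_mse(xs, ys, fit_line)
print("intercept-only model, LOO MSE:", mse_mean)
print("straight-line model, LOO MSE:", mse_line)
```

Comparing the two cross-validated errors is how a model complexity would be chosen in this setting. Each prediction is independent of the sample being predicted because that sample is excluded before fitting, which is the requirement the abstract says most current PCA implementations do not meet.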
