Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors

Statistics in Medicine
F E Harrell, D B Mark


Multivariable regression models are powerful tools that are used frequently in studies of clinical outcomes. These models can use a mixture of categorical and continuous variables and can handle partially observed (censored) responses. However, uncritical application of modelling techniques can result in models that poorly fit the dataset at hand, or, even more likely, inaccurately predict outcomes on new subjects. One must know how to measure qualities of a model's fit in order to avoid poorly fitted or overfitted models. Measurement of predictive accuracy can be difficult for survival time data in the presence of censoring. We discuss an easily interpretable index of predictive discrimination as well as methods for assessing calibration of predicted survival probabilities. Both types of predictive accuracy should be unbiasedly validated using bootstrapping or cross-validation, before using predictions in a new data series. We discuss some of the hazards of poorly fitted and overfitted regression models and present one modelling strategy that avoids many of the problems discussed. The methods described are applicable to all regression models, but are particularly needed for binary, ordinal, and time-to-event outcomes. Methods ...
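
The abstract does not name its index of predictive discrimination. The sketch below assumes it is a concordance (c) index of the kind commonly used for censored survival data: among pairs of subjects whose ordering of survival times can be determined, the fraction in which the subject with the higher predicted risk has the shorter observed time. The function name, argument names, and the treatment of tied times (such pairs are simply skipped) are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified c-index for right-censored data (illustrative sketch).

    times       -- observed follow-up time for each subject
    events      -- 1 if the event was observed, 0 if the subject was censored
    risk_scores -- model output; a higher score means shorter predicted survival
    """
    concordant = 0.0
    usable = 0
    for a, b in combinations(range(len(times)), 2):
        # Let `short` be the subject with the shorter follow-up time.
        short, long_ = (a, b) if times[a] <= times[b] else (b, a)
        # Assumption: pairs with tied times are skipped for simplicity.
        if times[short] == times[long_]:
            continue
        # The ordering is only known if the shorter time is an observed event.
        if not events[short]:
            continue
        usable += 1
        if risk_scores[short] > risk_scores[long_]:
            concordant += 1.0   # prediction agrees with the observed ordering
        elif risk_scores[short] == risk_scores[long_]:
            concordant += 0.5   # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs: cannot compute the index")
    return concordant / usable


# Hypothetical toy data: four subjects, the second one censored at time 8.
times = [5, 8, 3, 10]
events = [1, 0, 1, 1]
risk = [0.7, 0.4, 0.9, 0.2]
c = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random predictions and 1.0 to perfect discrimination. Computed on the same data used to fit the model, the index tends to be optimistic, which is why the abstract calls for validation by bootstrapping or cross-validation.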



Related Concepts

In Silico
Prostatic Neoplasms
Log-Linear Models
Computer Graphics
Survival Analysis
Prostate Carcinoma
Computer Programs and Programming
Statistical Programs, Computer Based
Data Interpretation, Statistical
Multivariate Analysis
