Minimum sample size for external validation of a clinical prediction model with a binary outcome

Richard D Riley; Thomas P A Debray; Gary S Collins; Lucinda Archer; Joie Ensor; Maarten van Smeden; Kym I E Snell

doi:10.1002/sim.9025

Corresponding author: Richard D Riley

Applied Health Research

Research output: Contribution to journal › Article › peer-review

Abstract

In prediction model research, external validation is needed to examine an existing model's performance using data independent to that for model development. Current external validation studies often suffer from small sample sizes and consequently imprecise predictive performance estimates. To address this, we propose how to determine the minimum sample size needed for a new external validation study of a prediction model for a binary outcome. Our calculations aim to precisely estimate calibration (Observed/Expected and calibration slope), discrimination (C-statistic), and clinical utility (net benefit). For each measure, we propose closed-form and iterative solutions for calculating the minimum sample size required. These require specifying: (i) target SEs (confidence interval widths) for each estimate of interest, (ii) the anticipated outcome event proportion in the validation population, (iii) the prediction model's anticipated (mis)calibration and variance of linear predictor values in the validation population, and (iv) potential risk thresholds for clinical decision-making. The calculations can also be used to inform whether the sample size of an existing (already collected) dataset is adequate for external validation. We illustrate our proposal for external validation of a prediction model for mechanical heart valve failure with an expected outcome event proportion of 0.018. Calculations suggest at least 9835 participants (177 events) are required to precisely estimate the calibration and discrimination measures, with this number driven by the calibration slope criterion, which we anticipate will often be the case. Also, 6443 participants (116 events) are required to precisely estimate net benefit at a risk threshold of 8%. Software code is provided.
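The figures quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only. `expected_events` reproduces the event counts from the stated sample sizes and the event proportion of 0.018. `n_for_oe` is a hypothetical helper that assumes the usual delta-method approximation SE(ln(O/E)) ≈ sqrt((1 − φ)/(Nφ)); this page does not give the paper's formulas, so treat it as an assumption rather than the authors' exact method. The target SE of 0.1 is an arbitrary example value, not one taken from the paper.

```python
import math


def expected_events(n_participants: int, event_proportion: float) -> int:
    """Expected number of outcome events, rounded to the nearest integer."""
    return round(n_participants * event_proportion)


def n_for_oe(event_proportion: float, target_se_ln_oe: float) -> int:
    """Smallest N meeting a target standard error for ln(O/E).

    ASSUMPTION: SE(ln(O/E)) ~= sqrt((1 - phi) / (N * phi)), a delta-method
    approximation; not confirmed by this page as the paper's formula.
    """
    phi = event_proportion
    return math.ceil((1 - phi) / (phi * target_se_ln_oe ** 2))


PHI = 0.018  # anticipated outcome event proportion, as stated in the abstract

print(expected_events(9835, PHI))  # calibration/discrimination criterion: 177
print(expected_events(6443, PHI))  # net benefit criterion: 116
print(n_for_oe(PHI, 0.1))          # hypothetical target SE of 0.1: 5456
```

Rounding to the nearest integer matches the event counts reported in the abstract (177 and 116). The closed form for O/E shows why a rare outcome inflates the requirement: N scales roughly with 1/φ for a fixed target standard error.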

Original language	English
Pages (from-to)	4230-4251
Number of pages	22
Journal	Statistics in Medicine
Volume	40
Issue number	19
Early online date	24 May 2021
DOIs	https://doi.org/10.1002/sim.9025
Publication status	Published - 30 Aug 2021

Bibliographical note

© 2021 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

Keywords

binary outcomes
calibration
discrimination
external validation
minimum sample size
multivariable prediction model
net benefit

Access to Document

10.1002/sim.9025 (Licence: Creative Commons Attribution, CC BY)

RileyR2021Minimum: Final published version, 2.16 MB (Licence: Creative Commons Attribution, CC BY)

Cite this

@article{53be8009448a43648525f88472a6506b,

title = "Minimum sample size for external validation of a clinical prediction model with a binary outcome",

abstract = "In prediction model research, external validation is needed to examine an existing model's performance using data independent to that for model development. Current external validation studies often suffer from small sample sizes and consequently imprecise predictive performance estimates. To address this, we propose how to determine the minimum sample size needed for a new external validation study of a prediction model for a binary outcome. Our calculations aim to precisely estimate calibration (Observed/Expected and calibration slope), discrimination (C-statistic), and clinical utility (net benefit). For each measure, we propose closed-form and iterative solutions for calculating the minimum sample size required. These require specifying: (i) target SEs (confidence interval widths) for each estimate of interest, (ii) the anticipated outcome event proportion in the validation population, (iii) the prediction model's anticipated (mis)calibration and variance of linear predictor values in the validation population, and (iv) potential risk thresholds for clinical decision-making. The calculations can also be used to inform whether the sample size of an existing (already collected) dataset is adequate for external validation. We illustrate our proposal for external validation of a prediction model for mechanical heart valve failure with an expected outcome event proportion of 0.018. Calculations suggest at least 9835 participants (177 events) are required to precisely estimate the calibration and discrimination measures, with this number driven by the calibration slope criterion, which we anticipate will often be the case. Also, 6443 participants (116 events) are required to precisely estimate net benefit at a risk threshold of 8%. Software code is provided.",

keywords = "binary outcomes, calibration, discrimination, external validation, minimum sample size, multivariable prediction model, net benefit",

author = "Riley, {Richard D} and Debray, {Thomas P A} and Collins, {Gary S} and Archer, Lucinda and Ensor, Joie and {van Smeden}, Maarten and Snell, {Kym I E}",

note = "{\textcopyright} 2021 The Authors. Statistics in Medicine published by John Wiley \& Sons Ltd.",

year = "2021",

month = aug,

day = "30",

doi = "10.1002/sim.9025",

language = "English",

volume = "40",

pages = "4230--4251",

journal = "Statistics in Medicine",

issn = "0277-6715",

publisher = "Wiley",

number = "19",

}

TY - JOUR

T1 - Minimum sample size for external validation of a clinical prediction model with a binary outcome

AU - Riley, Richard D

AU - Debray, Thomas P A

AU - Collins, Gary S

AU - Archer, Lucinda

AU - Ensor, Joie

AU - van Smeden, Maarten

AU - Snell, Kym I E

PY - 2021/8/30

Y1 - 2021/8/30

AB - In prediction model research, external validation is needed to examine an existing model's performance using data independent to that for model development. Current external validation studies often suffer from small sample sizes and consequently imprecise predictive performance estimates. To address this, we propose how to determine the minimum sample size needed for a new external validation study of a prediction model for a binary outcome. Our calculations aim to precisely estimate calibration (Observed/Expected and calibration slope), discrimination (C-statistic), and clinical utility (net benefit). For each measure, we propose closed-form and iterative solutions for calculating the minimum sample size required. These require specifying: (i) target SEs (confidence interval widths) for each estimate of interest, (ii) the anticipated outcome event proportion in the validation population, (iii) the prediction model's anticipated (mis)calibration and variance of linear predictor values in the validation population, and (iv) potential risk thresholds for clinical decision-making. The calculations can also be used to inform whether the sample size of an existing (already collected) dataset is adequate for external validation. We illustrate our proposal for external validation of a prediction model for mechanical heart valve failure with an expected outcome event proportion of 0.018. Calculations suggest at least 9835 participants (177 events) are required to precisely estimate the calibration and discrimination measures, with this number driven by the calibration slope criterion, which we anticipate will often be the case. Also, 6443 participants (116 events) are required to precisely estimate net benefit at a risk threshold of 8%. Software code is provided.

KW - binary outcomes

KW - calibration

KW - discrimination

KW - external validation

KW - minimum sample size

KW - multivariable prediction model

KW - net benefit

U2 - 10.1002/sim.9025

DO - 10.1002/sim.9025

M3 - Article

C2 - 34031906

SN - 0277-6715

VL - 40

SP - 4230

EP - 4251

JO - Statistics in Medicine

JF - Statistics in Medicine

IS - 19

ER -
