Minimum sample size calculations for external validation of a clinical prediction model with a time-to-event outcome

Richard D Riley; Gary S Collins; Joie Ensor; Lucinda Archer; Sarah Booth; Sarwar I Mozumder; Mark J Rutherford; Maarten van Smeden; Paul C Lambert; Kym I E Snell

doi:10.1002/sim.9275

Corresponding author: Richard D Riley

Applied Health Research

Research output: Contribution to journal › Article › peer-review

Abstract

Previous articles in Statistics in Medicine describe how to calculate the sample size required for external validation of prediction models with continuous and binary outcomes. The minimum sample size criteria aim to ensure precise estimation of key measures of a model's predictive performance, including measures of calibration, discrimination, and net benefit. Here, we extend the sample size guidance to prediction models with a time-to-event (survival) outcome, to cover external validation in datasets containing censoring. A simulation-based framework is proposed, which calculates the sample size required to target a particular confidence interval width for the calibration slope measuring the agreement between predicted risks (from the model) and observed risks (derived using pseudo-observations to account for censoring) on the log cumulative hazard scale. Precise estimation of calibration curves, discrimination, and net-benefit can also be checked in this framework. The process requires assumptions about the validation population in terms of the (i) distribution of the model's linear predictor and (ii) event and censoring distributions. Existing information can inform this; in particular, the linear predictor distribution can be approximated using the C-index or Royston's D statistic from the model development article, together with the overall event risk. We demonstrate how the approach can be used to calculate the sample size required to validate a prediction model for recurrent venous thromboembolism. Ideally the sample size should ensure precise calibration across the entire range of predicted risks, but must at least ensure adequate precision in regions important for clinical decision-making. Stata and R code are provided.
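The simulation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' Stata or R code and it deliberately swaps in a simpler technique: instead of pseudo-observations and a calibration model on the log cumulative hazard scale, it fits an exponential regression of the simulated survival data on the linear predictor and treats that coefficient as the calibration slope. It assumes a normally distributed linear predictor whose standard deviation is taken from Royston's D via D = σ√(8/π), exponential event times, and independent exponential censoring. All function names and parameter values (`d_stat`, `base_rate`, `cens_rate`) are hypothetical choices for illustration.

```python
import math
import random


def sigma_from_royston_d(d_stat):
    """SD of a normal linear predictor implied by Royston's D (D = sigma * sqrt(8 / pi))."""
    return d_stat / math.sqrt(8.0 / math.pi)


def simulate_dataset(n, lp_sd, base_rate, cens_rate, rng):
    """Simulate (lp, follow-up time, event indicator); exponential event and censoring times."""
    lps, times, events = [], [], []
    for _ in range(n):
        lp = rng.gauss(0.0, lp_sd)
        t_event = rng.expovariate(base_rate * math.exp(lp))  # hazard rises with lp
        t_cens = rng.expovariate(cens_rate)                  # independent censoring (assumed)
        lps.append(lp)
        times.append(min(t_event, t_cens))
        events.append(1 if t_event <= t_cens else 0)
    return lps, times, events


def fit_slope(lps, times, events, iters=15):
    """Fit log(hazard) = a + b * lp by Newton-Raphson; return (b, standard error of b)."""
    b = 1.0  # start at perfect calibration (slope 1)
    a = math.log(sum(events) / sum(t * math.exp(b * x) for x, t in zip(lps, times)))
    se_b = float("nan")
    for _ in range(iters):
        g_a = g_b = i_aa = i_ab = i_bb = 0.0
        for x, t, e in zip(lps, times, events):
            m = t * math.exp(a + b * x)  # expected number of events for this individual
            g_a += e - m
            g_b += (e - m) * x
            i_aa += m
            i_ab += m * x
            i_bb += m * x * x
        det = i_aa * i_bb - i_ab * i_ab
        se_b = math.sqrt(i_aa / det)  # from the inverse information at the current estimate
        a += (i_bb * g_a - i_ab * g_b) / det
        b += (i_aa * g_b - i_ab * g_a) / det
    return b, se_b


def mean_ci_width(n, d_stat, base_rate, cens_rate, n_sims=50, seed=1):
    """Average 95% confidence interval width for the slope across simulated validation datasets."""
    rng = random.Random(seed)
    lp_sd = sigma_from_royston_d(d_stat)
    widths = []
    for _ in range(n_sims):
        lps, times, events = simulate_dataset(n, lp_sd, base_rate, cens_rate, rng)
        _, se_b = fit_slope(lps, times, events)
        widths.append(2 * 1.96 * se_b)
    return sum(widths) / len(widths)


if __name__ == "__main__":
    # Hypothetical inputs: D = 1.0, equal event and censoring rates.
    for n in (250, 500, 1000):
        print(n, round(mean_ci_width(n, 1.0, 0.1, 0.1), 3))
```

In use, one would increase `n` until the average interval width falls to the chosen target, then repeat the check with the intended calibration model and with event and censoring distributions that reflect the validation population.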

Original language	English
Pages (from-to)	1280-1295
Number of pages	16
Journal	Statistics in Medicine
Volume	41
Issue number	7
Early online date	16 Dec 2021
DOIs	https://doi.org/10.1002/sim.9275
Publication status	Published - 30 Mar 2022

Bibliographical note

© 2021 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

Keywords

Calibration
external validation
prediction model
sample size
time-to-event & survival data

Access to Document

10.1002/sim.9275 (Licence: Creative Commons: Attribution (CC BY))

RileyR2021Minimum (Final published version, 2.72 MB; Licence: Creative Commons: Attribution (CC BY))

Cite this

@article{c4c30bea19134622bc5b3760eb9021f6,

title = "Minimum sample size calculations for external validation of a clinical prediction model with a time-to-event outcome",

abstract = "Previous articles in Statistics in Medicine describe how to calculate the sample size required for external validation of prediction models with continuous and binary outcomes. The minimum sample size criteria aim to ensure precise estimation of key measures of a model's predictive performance, including measures of calibration, discrimination, and net benefit. Here, we extend the sample size guidance to prediction models with a time-to-event (survival) outcome, to cover external validation in datasets containing censoring. A simulation-based framework is proposed, which calculates the sample size required to target a particular confidence interval width for the calibration slope measuring the agreement between predicted risks (from the model) and observed risks (derived using pseudo-observations to account for censoring) on the log cumulative hazard scale. Precise estimation of calibration curves, discrimination, and net-benefit can also be checked in this framework. The process requires assumptions about the validation population in terms of the (i) distribution of the model's linear predictor and (ii) event and censoring distributions. Existing information can inform this; in particular, the linear predictor distribution can be approximated using the C-index or Royston's D statistic from the model development article, together with the overall event risk. We demonstrate how the approach can be used to calculate the sample size required to validate a prediction model for recurrent venous thromboembolism. Ideally the sample size should ensure precise calibration across the entire range of predicted risks, but must at least ensure adequate precision in regions important for clinical decision-making. Stata and R code are provided.",

keywords = "Calibration, external validation, prediction model, sample size, time-to-event \& survival data",

author = "Riley, {Richard D} and Collins, {Gary S} and Ensor, Joie and Archer, Lucinda and Booth, Sarah and Mozumder, {Sarwar I} and Rutherford, {Mark J} and {van Smeden}, Maarten and Lambert, {Paul C} and Snell, {Kym I E}",

note = "{\textcopyright} 2021 The Authors. Statistics in Medicine published by John Wiley \& Sons Ltd.",

year = "2022",

month = mar,

day = "30",

doi = "10.1002/sim.9275",

language = "English",

volume = "41",

pages = "1280--1295",

journal = "Statistics in Medicine",

issn = "0277-6715",

publisher = "Wiley",

number = "7",

}

TY - JOUR

T1 - Minimum sample size calculations for external validation of a clinical prediction model with a time-to-event outcome

AU - Riley, Richard D

AU - Collins, Gary S

AU - Ensor, Joie

AU - Archer, Lucinda

AU - Booth, Sarah

AU - Mozumder, Sarwar I

AU - Rutherford, Mark J

AU - van Smeden, Maarten

AU - Lambert, Paul C

AU - Snell, Kym I E

PY - 2022/3/30

Y1 - 2022/3/30

AB - Previous articles in Statistics in Medicine describe how to calculate the sample size required for external validation of prediction models with continuous and binary outcomes. The minimum sample size criteria aim to ensure precise estimation of key measures of a model's predictive performance, including measures of calibration, discrimination, and net benefit. Here, we extend the sample size guidance to prediction models with a time-to-event (survival) outcome, to cover external validation in datasets containing censoring. A simulation-based framework is proposed, which calculates the sample size required to target a particular confidence interval width for the calibration slope measuring the agreement between predicted risks (from the model) and observed risks (derived using pseudo-observations to account for censoring) on the log cumulative hazard scale. Precise estimation of calibration curves, discrimination, and net-benefit can also be checked in this framework. The process requires assumptions about the validation population in terms of the (i) distribution of the model's linear predictor and (ii) event and censoring distributions. Existing information can inform this; in particular, the linear predictor distribution can be approximated using the C-index or Royston's D statistic from the model development article, together with the overall event risk. We demonstrate how the approach can be used to calculate the sample size required to validate a prediction model for recurrent venous thromboembolism. Ideally the sample size should ensure precise calibration across the entire range of predicted risks, but must at least ensure adequate precision in regions important for clinical decision-making. Stata and R code are provided.

KW - Calibration

KW - external validation

KW - prediction model

KW - sample size

KW - time-to-event & survival data

U2 - 10.1002/sim.9275

DO - 10.1002/sim.9275

M3 - Article

C2 - 34915593

SN - 0277-6715

VL - 41

SP - 1280

EP - 1295

JO - Statistics in Medicine

JF - Statistics in Medicine

IS - 7

ER -
