Developing more generalizable prediction models from pooled studies and large clustered data sets

V.M.T. de Jong; K.G.M. Moons; M.J.C. Eijkemans; R.D. Riley; T.P.A. Debray

doi:10.1002/sim.8981

Corresponding author: V.M.T. de Jong

Applied Health Research

Research output: Contribution to journal › Article › peer-review

Abstract

Prediction models often yield inaccurate predictions for new individuals. Large data sets from pooled studies or electronic healthcare records may alleviate this with an increased sample size and variability in sample characteristics. However, existing strategies for prediction model development generally do not account for heterogeneity in predictor-outcome associations between different settings and populations. This limits the generalizability of developed models (even from large, combined, clustered data sets) and necessitates local revisions. We aim to develop methodology for producing prediction models that require less tailoring to different settings and populations. We adopt internal-external cross-validation to assess and reduce heterogeneity in models' predictive performance during the development. We propose a predictor selection algorithm that optimizes the (weighted) average performance while minimizing its variability across the hold-out clusters (or studies). Predictors are added iteratively until the estimated generalizability is optimized. We illustrate this by developing a model for predicting the risk of atrial fibrillation and updating an existing one for diagnosing deep vein thrombosis, using individual participant data from 20 cohorts (N = 10 873) and 11 diagnostic studies (N = 10 014), respectively. Meta-analysis of calibration and discrimination performance in each hold-out cluster shows that trade-offs between average and heterogeneity of performance occurred. Our methodology enables the assessment of heterogeneity of prediction model performance during model development in multiple or clustered data sets, thereby informing researchers on predictor selection to improve the generalizability to different settings and populations, and reduce the need for model tailoring. Our methodology has been implemented in the R package metamisc.
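
The selection procedure described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation (that is in the R package metamisc), and several details are assumptions made only for illustration: the Brier score as the performance measure, an unweighted mean plus a standard-deviation penalty as the generalizability criterion (the paper summarizes hold-out performance by meta-analysis), and the hypothetical helper names `fit_logistic`, `brier_by_cluster`, `criterion`, and `forward_select`. The data are simulated: predictor 0 has the same effect in every cluster, predictor 1 has an effect whose sign differs between clusters, and predictor 2 is noise.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Logistic regression by Newton-Raphson; X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p) - ridge * beta          # tiny ridge term for numerical stability
        hess = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def brier_by_cluster(X, y, cluster, cols):
    """Internal-external CV: hold out each cluster once; return one Brier score per cluster."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    scores = []
    for c in np.unique(cluster):
        test = cluster == c
        beta = fit_logistic(design[~test], y[~test])
        p = 1.0 / (1.0 + np.exp(-design[test] @ beta))
        scores.append(np.mean((p - y[test]) ** 2))
    return np.array(scores)

def criterion(scores, penalty=1.0):
    """Assumed criterion: average hold-out loss plus a penalty on its between-cluster spread."""
    return scores.mean() + penalty * scores.std(ddof=1)

def forward_select(X, y, cluster, penalty=1.0):
    """Add predictors one at a time until the criterion stops improving."""
    selected = []
    best = criterion(brier_by_cluster(X, y, cluster, selected), penalty)
    while True:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        trials = {j: criterion(brier_by_cluster(X, y, cluster, selected + [j]), penalty)
                  for j in candidates}
        j_best = min(trials, key=trials.get)
        if trials[j_best] >= best:                   # no candidate improves generalizability
            break
        selected.append(j_best)
        best = trials[j_best]
    return selected, best

# Simulated clustered data (all values here are illustrative assumptions).
rng = np.random.default_rng(0)
n_clusters, n_per = 6, 300
cluster = np.repeat(np.arange(n_clusters), n_per)
X = rng.normal(size=(n_clusters * n_per, 3))
slope1 = np.where(cluster % 2 == 0, 1.0, -1.0)       # heterogeneous effect of predictor 1
logit = -0.5 + 1.5 * X[:, 0] + slope1 * X[:, 1]
y = (rng.random(len(logit)) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

selected, best = forward_select(X, y, cluster)
print("selected predictors:", selected, "criterion:", round(best, 4))
```

In this sketch, raising `penalty` favors predictors whose contribution is stable across hold-out clusters over predictors that only improve the average, which is the trade-off between average performance and heterogeneity that the abstract refers to.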

Original language	English
Pages (from-to)	3533-3559
Number of pages	27
Journal	Statistics in Medicine
Volume	40
Issue number	15
Early online date	5 May 2021
DOIs	https://doi.org/10.1002/sim.8981
Publication status	Published - 10 Jul 2021

Keywords

heterogeneity
individual participant data
internal-external cross-validation
prediction

Access to Document

https://doi.org/10.1002/sim.8981 (final published version; licence: Creative Commons Attribution, CC BY)

Cite this

@article{d9d0b93afb0546f1a1e27606bfaa43b9,

title = "Developing more generalizable prediction models from pooled studies and large clustered data sets",

abstract = "Prediction models often yield inaccurate predictions for new individuals. Large data sets from pooled studies or electronic healthcare records may alleviate this with an increased sample size and variability in sample characteristics. However, existing strategies for prediction model development generally do not account for heterogeneity in predictor-outcome associations between different settings and populations. This limits the generalizability of developed models (even from large, combined, clustered data sets) and necessitates local revisions. We aim to develop methodology for producing prediction models that require less tailoring to different settings and populations. We adopt internal-external cross-validation to assess and reduce heterogeneity in models' predictive performance during the development. We propose a predictor selection algorithm that optimizes the (weighted) average performance while minimizing its variability across the hold-out clusters (or studies). Predictors are added iteratively until the estimated generalizability is optimized. We illustrate this by developing a model for predicting the risk of atrial fibrillation and updating an existing one for diagnosing deep vein thrombosis, using individual participant data from 20 cohorts (N = 10 873) and 11 diagnostic studies (N = 10 014), respectively. Meta-analysis of calibration and discrimination performance in each hold-out cluster shows that trade-offs between average and heterogeneity of performance occurred. Our methodology enables the assessment of heterogeneity of prediction model performance during model development in multiple or clustered data sets, thereby informing researchers on predictor selection to improve the generalizability to different settings and populations, and reduce the need for model tailoring. Our methodology has been implemented in the R package metamisc.",

keywords = "heterogeneity, individual participant data, internal-external cross-validation, prediction",

author = "{de Jong}, V.M.T. and K.G.M. Moons and M.J.C. Eijkemans and R.D. Riley and T.P.A. Debray",

year = "2021",

month = jul,

day = "10",

doi = "10.1002/sim.8981",

language = "English",

volume = "40",

pages = "3533--3559",

journal = "Statistics in Medicine",

issn = "0277-6715",

publisher = "Wiley",

number = "15",

}

TY - JOUR

T1 - Developing more generalizable prediction models from pooled studies and large clustered data sets

AU - de Jong, V.M.T.

AU - Moons, K.G.M.

AU - Eijkemans, M.J.C.

AU - Riley, R.D.

AU - Debray, T.P.A.

PY - 2021/7/10

Y1 - 2021/7/10

N2 - Prediction models often yield inaccurate predictions for new individuals. Large data sets from pooled studies or electronic healthcare records may alleviate this with an increased sample size and variability in sample characteristics. However, existing strategies for prediction model development generally do not account for heterogeneity in predictor-outcome associations between different settings and populations. This limits the generalizability of developed models (even from large, combined, clustered data sets) and necessitates local revisions. We aim to develop methodology for producing prediction models that require less tailoring to different settings and populations. We adopt internal-external cross-validation to assess and reduce heterogeneity in models' predictive performance during the development. We propose a predictor selection algorithm that optimizes the (weighted) average performance while minimizing its variability across the hold-out clusters (or studies). Predictors are added iteratively until the estimated generalizability is optimized. We illustrate this by developing a model for predicting the risk of atrial fibrillation and updating an existing one for diagnosing deep vein thrombosis, using individual participant data from 20 cohorts (N = 10 873) and 11 diagnostic studies (N = 10 014), respectively. Meta-analysis of calibration and discrimination performance in each hold-out cluster shows that trade-offs between average and heterogeneity of performance occurred. Our methodology enables the assessment of heterogeneity of prediction model performance during model development in multiple or clustered data sets, thereby informing researchers on predictor selection to improve the generalizability to different settings and populations, and reduce the need for model tailoring. Our methodology has been implemented in the R package metamisc.

KW - heterogeneity

KW - individual participant data

KW - internal-external cross-validation

KW - prediction

UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85105060785&partnerID=MN8TOARS

U2 - 10.1002/sim.8981

DO - 10.1002/sim.8981

M3 - Article

SN - 0277-6715

VL - 40

SP - 3533

EP - 3559

JO - Statistics in Medicine

JF - Statistics in Medicine

IS - 15

ER -
