Reporting methods in studies developing prognostic models in cancer: a review

Susan Mallett; Patrick Royston; Susan Dutton; Rachel Waters; Douglas G Altman

doi:10.1186/1741-7015-8-20

Applied Health Research

Research output: Contribution to journal › Article › peer-review

119 Citations (Scopus)

Abstract

BACKGROUND: Development of prognostic models enables identification of variables that are influential in predicting patient outcome and the use of these multiple risk factors in a systematic, reproducible way according to evidence based methods. The reliability of models depends on informed use of statistical methods, in combination with prior knowledge of disease. We reviewed published articles to assess reporting and methods used to develop new prognostic models in cancer.

METHODS: We developed a systematic search string and identified articles from PubMed. Forty-seven articles were included that satisfied the following inclusion criteria: published in 2005; aiming to predict patient outcome; presenting new prognostic models in cancer with outcome time to an event and including a combination of at least two separate variables; and analysing data using multivariable analysis suitable for time to event data.

RESULTS: In 47 studies, prospective cohort or randomised controlled trial data were used for model development in only 33% (15) of studies. In 30% (14) of the studies insufficient data were available, having fewer than 10 events per variable (EPV) used in model development. EPV could not be calculated in a further 40% (19) of the studies. The coding of candidate variables was only reported in 68% (32) of the studies. Although use of continuous variables was reported in all studies, only one article reported using recommended methods of retaining all these variables as continuous without categorisation. Statistical methods for selection of variables in the multivariate modelling were often flawed. A method that is not recommended, namely, using statistical significance in univariate analysis as a pre-screening test to select variables for inclusion in the multivariate model, was applied in 48% (21) of the studies.

CONCLUSIONS: We found that published prognostic models are often characterised by both use of inappropriate methods for development of multivariable models and poor reporting. In addition, models are limited by the lack of studies based on prospective data of sufficient sample size to avoid overfitting. The use of poor methods compromises the reliability of prognostic models developed to provide objective probability estimates to complement clinical intuition of the physician and guidelines.
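
The events-per-variable (EPV) criterion mentioned in the results can be illustrated with a short sketch. This is not code from the article: the function names, the choice to count candidate variables in the denominator, and the example numbers are assumptions made for illustration. Only the threshold of 10 comes from the abstract.

```python
def events_per_variable(n_events: int, n_variables: int) -> float:
    """Return events per variable (EPV): outcome events divided by variables.

    Assumption: n_variables counts the variables used in model development;
    the abstract does not define the denominator in more detail.
    """
    if n_events < 0:
        raise ValueError("n_events must be non-negative")
    if n_variables <= 0:
        raise ValueError("n_variables must be a positive integer")
    return n_events / n_variables


def has_sufficient_events(n_events: int, n_variables: int,
                          threshold: float = 10.0) -> bool:
    """True when EPV reaches the threshold (10 in the abstract)."""
    return events_per_variable(n_events, n_variables) >= threshold


if __name__ == "__main__":
    # Hypothetical studies, not data from the review.
    for events, variables in [(120, 8), (45, 6)]:
        epv = events_per_variable(events, variables)
        verdict = "sufficient" if has_sufficient_events(events, variables) else "insufficient"
        print(f"{events} events, {variables} variables: EPV = {epv:.1f} ({verdict})")
```

A study that does not report the number of events or the number of variables cannot be passed to such a check, which corresponds to the studies in which EPV could not be calculated.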

Original language	English
Pages (from-to)	20
Journal	BMC medicine
Volume	8
DOIs	https://doi.org/10.1186/1741-7015-8-20
Publication status	Published - 2010

Keywords

Humans
Multivariate Analysis
Neoplasms
Prognosis
Statistics as Topic

Cite this

@article{acc0d8169cd34f53a93f5db9f8c67983,

title = "Reporting methods in studies developing prognostic models in cancer: a review",

keywords = "Humans, Multivariate Analysis, Neoplasms, Prognosis, Statistics as Topic",

author = "Susan Mallett and Patrick Royston and Susan Dutton and Rachel Waters and Altman, {Douglas G}",

year = "2010",

doi = "10.1186/1741-7015-8-20",

language = "English",

volume = "8",

pages = "20",

journal = "BMC medicine",

issn = "1741-7015",

publisher = "Springer",

}

TY - JOUR

T1 - Reporting methods in studies developing prognostic models in cancer

T2 - a review

AU - Mallett, Susan

AU - Royston, Patrick

AU - Dutton, Susan

AU - Waters, Rachel

AU - Altman, Douglas G

PY - 2010

Y1 - 2010

KW - Humans

KW - Multivariate Analysis

KW - Neoplasms

KW - Prognosis

KW - Statistics as Topic

U2 - 10.1186/1741-7015-8-20

DO - 10.1186/1741-7015-8-20

M3 - Article

C2 - 20353578

SN - 1741-7015

VL - 8

SP - 20

JO - BMC medicine

JF - BMC medicine

ER -
