A Large-Scale Empirical Evaluation of Cross-Validation and External Test Set Validation in (Q)SAR

Martin Gütlein; Christoph Helma; Andreas Karwath; Stefan Kramer

doi:10.1002/minf.201200134

Research output: Contribution to journal › Article › peer-review

26 Citations (Scopus)

Abstract

(Q)SAR model validation is essential to ensure the quality of inferred models and to indicate future model predictivity on unseen compounds. Proper validation is also one of the requirements regulatory authorities impose before accepting a (Q)SAR model and approving its use in real-world scenarios as an alternative testing method. At the same time, the question of how to validate a (Q)SAR model, in particular whether to employ variants of cross-validation or external test set validation, is still under discussion. In this paper, we empirically compare k-fold cross-validation with external test set validation. To this end, we introduce a workflow that realistically simulates the common problem setting of building predictive models for relatively small datasets. The workflow allows the built and validated models to be applied to large amounts of unseen data, and the performance of the different validation approaches to be compared. The experimental results indicate that cross-validation produces better-performing (Q)SAR models than external test set validation and reduces the variance of the results, while at the same time underestimating the performance on unseen compounds. The experimental results reported in this paper suggest that, contrary to the current conception in the community, cross-validation may play a significant role in evaluating the predictivity of (Q)SAR models.
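
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' workflow: the synthetic one-feature data, the nearest-centroid model, the sample sizes, and all function names (`make_data`, `fit`, `cross_validate`, `external_validate`) are assumptions made for the example. It also assumes that the cross-validated model is refit on all available data, while the externally validated model is fit on the training part only.

```python
import random


def make_data(n, rng):
    """Synthetic two-class data with one noisy feature (illustrative only)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label else -1.0, 1.5)
        data.append((x, label))
    return data


def fit(train):
    """Nearest-centroid 'model': the mean feature value of each class."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs) if xs else 0.0
    return means


def accuracy(model, data):
    """Fraction of points whose nearest class centroid matches the label."""
    hits = sum(
        1 for x, y in data
        if min(model, key=lambda c: abs(x - model[c])) == y
    )
    return hits / len(data)


def k_fold_indices(n, k):
    """Split range(n) into k disjoint folds of near-equal size."""
    return [list(range(i, n, k)) for i in range(k)]


def cross_validate(data, k):
    """Return the k-fold CV accuracy estimate and a model fit on all data."""
    hits = 0.0
    for fold in k_fold_indices(len(data), k):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        hits += accuracy(fit(train), test) * len(test)
    return hits / len(data), fit(data)


def external_validate(data, n_test):
    """Return the hold-out accuracy estimate and a model fit on the rest."""
    cut = len(data) - n_test
    model = fit(data[:cut])
    return accuracy(model, data[cut:]), model


rng = random.Random(0)
small = make_data(100, rng)       # the "relatively small" modelling dataset
unseen = make_data(10000, rng)    # large pool of unseen compounds

cv_estimate, cv_model = cross_validate(small, k=10)
ext_estimate, ext_model = external_validate(small, n_test=30)

print("CV estimate:      ", cv_estimate, "on unseen:", accuracy(cv_model, unseen))
print("External estimate:", ext_estimate, "on unseen:", accuracy(ext_model, unseen))
```

Each validation estimate can then be compared with the accuracy measured on the large unseen set; repeating the run over many seeds would show how much each estimate varies.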

Original language	English
Pages (from-to)	516-528
Number of pages	13
Journal	Molecular Informatics
Volume	32
Issue number	5-6
DOIs	https://doi.org/10.1002/minf.201200134
Publication status	Published - 2013

Keywords

cheminformatics, crossvalidation, external validation, QSAR

Access to Document

10.1002/minf.201200134

http://onlinelibrary.wiley.com/doi/10.1002/minf.201200134/abstract

Cite this

@article{23ba99a6e64943aba3657bee246540f4,

title = "A Large-Scale Empirical Evaluation of Cross-Validation and External Test Set Validation in (Q)SAR",

abstract = "(Q)SAR model validation is essential to ensure the quality of inferred models and to indicate future model predictivity on unseen compounds. Proper validation is also one of the requirements regulatory authorities impose before accepting a (Q)SAR model and approving its use in real-world scenarios as an alternative testing method. At the same time, the question of how to validate a (Q)SAR model, in particular whether to employ variants of cross-validation or external test set validation, is still under discussion. In this paper, we empirically compare k-fold cross-validation with external test set validation. To this end, we introduce a workflow that realistically simulates the common problem setting of building predictive models for relatively small datasets. The workflow allows the built and validated models to be applied to large amounts of unseen data, and the performance of the different validation approaches to be compared. The experimental results indicate that cross-validation produces better-performing (Q)SAR models than external test set validation and reduces the variance of the results, while at the same time underestimating the performance on unseen compounds. The experimental results reported in this paper suggest that, contrary to the current conception in the community, cross-validation may play a significant role in evaluating the predictivity of (Q)SAR models.",

keywords = "cheminformatics, crossvalidation, external validation, QSAR",

author = "Martin G{\"u}tlein and Christoph Helma and Andreas Karwath and Stefan Kramer",

year = "2013",

doi = "10.1002/minf.201200134",

language = "English",

volume = "32",

pages = "516--528",

journal = "Molecular Informatics",

issn = "1868-1743",

publisher = "Wiley - V C H Verlag GmbH \& Co. KGaA",

number = "5--6",

}

TY - JOUR

T1 - A Large-Scale Empirical Evaluation of Cross-Validation and External Test Set Validation in (Q)SAR

AU - Gütlein, Martin

AU - Helma, Christoph

AU - Karwath, Andreas

AU - Kramer, Stefan

PY - 2013

Y1 - 2013

N2 - (Q)SAR model validation is essential to ensure the quality of inferred models and to indicate future model predictivity on unseen compounds. Proper validation is also one of the requirements regulatory authorities impose before accepting a (Q)SAR model and approving its use in real-world scenarios as an alternative testing method. At the same time, the question of how to validate a (Q)SAR model, in particular whether to employ variants of cross-validation or external test set validation, is still under discussion. In this paper, we empirically compare k-fold cross-validation with external test set validation. To this end, we introduce a workflow that realistically simulates the common problem setting of building predictive models for relatively small datasets. The workflow allows the built and validated models to be applied to large amounts of unseen data, and the performance of the different validation approaches to be compared. The experimental results indicate that cross-validation produces better-performing (Q)SAR models than external test set validation and reduces the variance of the results, while at the same time underestimating the performance on unseen compounds. The experimental results reported in this paper suggest that, contrary to the current conception in the community, cross-validation may play a significant role in evaluating the predictivity of (Q)SAR models.

AB - (Q)SAR model validation is essential to ensure the quality of inferred models and to indicate future model predictivity on unseen compounds. Proper validation is also one of the requirements regulatory authorities impose before accepting a (Q)SAR model and approving its use in real-world scenarios as an alternative testing method. At the same time, the question of how to validate a (Q)SAR model, in particular whether to employ variants of cross-validation or external test set validation, is still under discussion. In this paper, we empirically compare k-fold cross-validation with external test set validation. To this end, we introduce a workflow that realistically simulates the common problem setting of building predictive models for relatively small datasets. The workflow allows the built and validated models to be applied to large amounts of unseen data, and the performance of the different validation approaches to be compared. The experimental results indicate that cross-validation produces better-performing (Q)SAR models than external test set validation and reduces the variance of the results, while at the same time underestimating the performance on unseen compounds. The experimental results reported in this paper suggest that, contrary to the current conception in the community, cross-validation may play a significant role in evaluating the predictivity of (Q)SAR models.

KW - cheminformatics, crossvalidation, external validation, QSAR

U2 - 10.1002/minf.201200134

DO - 10.1002/minf.201200134

M3 - Article

SN - 1868-1743

VL - 32

SP - 516

EP - 528

JO - Molecular Informatics

JF - Molecular Informatics

IS - 5-6

ER -
