TY - GEN
T1 - Achieving reproducibility by combining provenance with service and workflow versioning
AU - Woodman, Simon
AU - Hiden, Hugo
AU - Watson, Paul
AU - Missier, Paolo
PY - 2011
Y1 - 2011
AB - Capturing and exploiting provenance information is considered important across a range of scientific, medical, commercial and Web applications [1, 2, 3, 4], including recent trends towards publishing provenance-rich, executable papers [5]. This article shows how the range of useful questions that provenance can answer is greatly increased when it is encapsulated into a system that can store and execute both current and old versions of workflows and services. e-Science Central provides a scalable, secure cloud platform for application developers. They can use it to upload data (for storage on the cloud) and services, which can be written in a variety of languages. These services can then be combined through workflows, which are enacted in the cloud to compute over the data. When a workflow runs, a complete provenance trace is recorded. This paper shows how this provenance trace, used in conjunction with the ability to execute old versions of services and workflows (rather than just the latest versions), can provide useful information that would otherwise not be available, including the key ability to reproduce experiments and to compare the effects of old and new versions of services on computations.
KW - E-science
KW - Provenance
KW - Reproducibility
UR - http://www.scopus.com/inward/record.url?scp=84857934962&partnerID=8YFLogxK
U2 - 10.1145/2110497.2110512
DO - 10.1145/2110497.2110512
M3 - Conference contribution
AN - SCOPUS:84857934962
SN - 9781450311007
T3 - WORKS'11 - Proceedings of the 6th Workshop on Workflows in Support of Large-Scale Science, Co-located with SC'11
SP - 127
EP - 136
BT - WORKS'11 - Proceedings of the 6th Workshop on Workflows in Support of Large-Scale Science, Co-located with SC'11
T2 - 6th Workshop on Workflows in Support of Large-Scale Science, WORKS'11, Co-located with SC'11
Y2 - 14 November 2011
ER -