Achieving reproducibility by combining provenance with service and workflow versioning

Simon Woodman*, Hugo Hiden, Paul Watson, Paolo Missier

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Capturing and exploiting provenance information is considered to be important across a range of scientific, medical, commercial and Web applications [1, 2, 3, 4], including recent trends towards publishing provenance-rich, executable papers [5]. This article shows how the range of useful questions that provenance can answer is greatly increased when it is encapsulated into a system that can store and execute both current and old versions of workflows and services. e-Science Central provides a scalable, secure cloud platform for application developers. They can use it to upload data - for storage on the cloud - and services, which can be written in a variety of languages. These services can then be combined through workflows which are enacted in the cloud to compute over the data. When a workflow runs, a complete provenance trace is recorded. This paper shows how this provenance trace, used in conjunction with the ability to execute old versions of services and workflows (rather than just the latest versions), can provide useful information that would otherwise not be possible, including the key ability to reproduce experiments and to compare the effects of old and new versions of services on computations.

Original language: English
Title of host publication: WORKS'11 - Proceedings of the 6th Workshop on Workflows in Support of Large-Scale Science, Co-located with SC'11
Pages: 127-136
Number of pages: 10
Publication status: Published - 2011
Event: 6th Workshop on Workflows in Support of Large-Scale Science, WORKS'11, Co-located with SC'11 - Seattle, WA, United States
Duration: 14 Nov 2011 → 14 Nov 2011

Publication series

Name: WORKS'11 - Proceedings of the 6th Workshop on Workflows in Support of Large-Scale Science, Co-located with SC'11

Conference

Conference: 6th Workshop on Workflows in Support of Large-Scale Science, WORKS'11, Co-located with SC'11
Country/Territory: United States
City: Seattle, WA
Period: 14/11/11 → 14/11/11

Keywords

  • E-science
  • Provenance
  • Reproducibility

ASJC Scopus subject areas

  • Computer Networks and Communications
