Provenance annotation and analysis to support process re-computation

Jacek Cała*, Paolo Missier

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Many resource-intensive analytics processes evolve over time, following new versions of the reference datasets and software dependencies they use. We focus on scenarios in which any version change has the potential to affect many outcomes, as is the case, for instance, in high-throughput genomics, where the same process is used to analyse large cohorts of patient genomes, or cases. As any version change is unlikely to affect the entire population, an efficient strategy for restoring the currency of the outcomes first requires identifying the scope of a change, i.e., the subset of affected data products. In this paper we describe a generic and reusable provenance-based approach to address this scope discovery problem. It applies to a scenario where the process consists of complex hierarchical components, where different input cases are processed using different version configurations of each component, and where separate provenance traces are collected for the executions of each of the components. We show how a new data structure, called a restart tree, is computed and exploited to manage the change scope discovery problem.
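The scope discovery idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method and does not model the restart tree, whose structure the abstract does not describe. It assumes a hypothetical flat record type, `Usage`, stating which version of a dependency each component execution used for each case, and a hypothetical helper, `change_scope`, that returns the cases (and their components) whose recorded version differs from a newly released one. All identifiers and sample values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Hypothetical provenance record: one component execution for one case
    used one dependency at one version."""
    case_id: str
    component: str
    dependency: str
    version: str


def change_scope(records, dependency, new_version):
    """Map each affected case to the components that used `dependency`
    at a version other than `new_version`."""
    affected = {}
    for r in records:
        if r.dependency == dependency and r.version != new_version:
            affected.setdefault(r.case_id, set()).add(r.component)
    return affected


# Invented sample data: three cases processed under different configurations.
records = [
    Usage("case-1", "align", "ref-genome", "v37"),
    Usage("case-1", "annotate", "variant-db", "2017-01"),
    Usage("case-2", "align", "ref-genome", "v38"),
    Usage("case-2", "annotate", "variant-db", "2017-01"),
    Usage("case-3", "annotate", "variant-db", "2018-02"),
]

# A new release of the (hypothetical) variant database affects only the
# cases that were annotated with an older version.
scope = change_scope(records, "variant-db", "2018-02")
```

In this sketch only `case-1` and `case-2` fall within the scope of the change, so `case-3` would not need to be re-computed.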

Original language: English
Title of host publication: Provenance and Annotation of Data and Processes - 7th International Provenance and Annotation Workshop, IPAW 2018, Proceedings
Editors: Khalid Belhajjame, Ashish Gehani, Pinar Alper
Publisher: Springer Verlag
Pages: 3-15
Number of pages: 13
ISBN (Print): 9783319983783
Publication status: Published - 2018
Event: 7th International Provenance and Annotation Workshop, IPAW 2018 - London, United Kingdom
Duration: 9 Jul 2018 to 10 Jul 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11017 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th International Provenance and Annotation Workshop, IPAW 2018
Country/Territory: United Kingdom
City: London
Period: 9/07/18 to 10/07/18

Bibliographical note

Publisher Copyright:
© Springer Nature Switzerland AG 2018.

Keywords

  • Process re-computation
  • Provenance annotations

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
