Abstract
The PROV data model assumes that entities are immutable and all changes to an entity e are represented by the creation of a new entity e’. This is reasonable for many provenance applications but may produce verbose results once we move towards fine-grained provenance due to the possibility of multiple binds (i.e., variables, elements of data structures) referring to the same mutable data objects (e.g., lists or dictionaries in Python). Changing a data object that is referenced by multiple immutable entities requires duplicating those immutable entities to keep consistency. This imposes an overhead on the provenance storage and makes it hard to represent data-changing operations and their effect on the provenance graph. In this paper, we propose a PROV extension to represent mutable data structures. We do this by adding reference derivations and checkpoints. We evaluate our approach by comparing it to plain PROV and PROV-Dictionary. Results indicate a reduction in the storage overhead for assignments and changes in data structures from O(N) and Ω(R × N), respectively, to O(1) in both cases when compared to plain PROV (N is the number of members in the data structure and R is the number of references to the data structure).
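The aliasing situation the abstract describes can be illustrated with a minimal Python sketch. The variable names and the record-counting rule at the end are assumptions made for illustration only; they are a toy reading of the Ω(R × N) bound, not the paper's actual provenance model.

```python
# Three bindings (two variables and one dictionary element) that all
# refer to the same mutable list object.
data = [1, 2, 3]
alias = data                # second reference; no copy is made
holder = {"items": data}    # third reference, through a data structure element

bindings = [data, alias, holder["items"]]

# One in-place change is visible through every binding.
data.append(4)

same_object = all(b is data for b in bindings)
observed = [list(b) for b in bindings]

# Toy bookkeeping (assumption, not the paper's exact accounting):
# if every binding were an immutable entity, each of the R bindings
# would need a fresh entity listing all N members after the change.
R = len(bindings)           # number of references
N = len(data)               # number of members after the change
records_if_immutable = R * N

print(same_object, observed, records_if_immutable)
```

Under this toy count, the cost of one change grows with both the number of references and the number of members, which is the overhead the proposed extension aims to avoid.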
Original language | English |
---|---|
Title of host publication | Provenance and Annotation of Data and Processes - 7th International Provenance and Annotation Workshop, IPAW 2018, Proceedings |
Editors | Khalid Belhajjame, Ashish Gehani, Pinar Alper |
Publisher | Springer Verlag |
Pages | 87-100 |
Number of pages | 14 |
ISBN (Print) | 9783319983783 |
Publication status | Published - 2018 |
Event | 7th International Provenance and Annotation Workshop, IPAW 2018 - London, United Kingdom. Duration: 9 Jul 2018 → 10 Jul 2018 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11017 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 7th International Provenance and Annotation Workshop, IPAW 2018 |
---|---|
Country/Territory | United Kingdom |
City | London |
Period | 9/07/18 → 10/07/18 |
Bibliographical note
Publisher Copyright: © Springer Nature Switzerland AG 2018.
Keywords
- Interoperability
- Provenance
- Specification
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science