Automated and non-intrusive provenance capture with UML2PROV

Carlos Sáenz Adán , Francisco José García Izquierdo , Beatriz Pérez*, Trung Dong Huynh, Luc Moreau

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Data provenance is a form of knowledge graph providing an account of what a system performs, describing the data involved and the processes carried out over them. It is crucial to ascertaining the origin of data, validating their quality, auditing application behaviours, and, ultimately, making applications accountable. However, instrumenting applications, especially legacy ones, to track the provenance of their operations remains a significant technical hurdle, hindering the adoption of provenance technology. UML2PROV is a software-engineering methodology that facilitates the instrumentation of provenance recording in applications designed with UML diagrams. From the application’s UML diagrams, it automates the generation of (1) templates for the provenance to be recorded and (2) the code that captures, at run time, the values required to instantiate those templates. By so doing, UML2PROV frees application developers from manually instrumenting provenance capture while ensuring the quality of the recorded provenance.
In this paper, we present in detail UML2PROV’s approach to generating application code for capturing provenance values by means of the Bindings Generation Module (BGM). In particular, we propose a set of requirements for BGM implementations and describe an event-based design of the BGM that relies on the Aspect-Oriented Programming (AOP) paradigm to automatically weave the generated code into an application. Finally, we present three different BGM implementations that follow this design and analyze their pros and cons in terms of computing and storage overheads and their implications for provenance consumers.
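
The separation between templates and run-time bindings described above can be illustrated with a minimal sketch. It is not UML2PROV's generated code or the BGM itself: the template statements, the `var:` placeholder names, and the helpers `capture_bindings` and `instantiate` are all hypothetical, and a Python decorator stands in for AOP weaving so that the application function's body stays untouched.

```python
import functools
import itertools

# Hypothetical template: PROV-like statements whose "var:" placeholders
# are filled in later from values captured at run time.
TEMPLATE = [
    ("wasGeneratedBy", "var:output", "var:operation"),
    ("used", "var:operation", "var:input"),
]

BINDINGS = []  # one dict (variable -> value) per intercepted call
_call_ids = itertools.count(1)


def capture_bindings(func):
    """Wrap func so each call records a set of bindings.

    The decorator plays the role of an aspect: the capture logic lives
    outside the body of the application function.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        BINDINGS.append({
            "var:operation": f"{func.__name__}#{next(_call_ids)}",
            "var:input": repr(args),
            "var:output": repr(result),
        })
        return result
    return wrapper


def instantiate(template, bindings):
    """Replace each template variable with its bound value."""
    return [tuple(bindings.get(term, term) for term in stmt) for stmt in template]


# Hypothetical application operation; its body contains no provenance code.
@capture_bindings
def add_item(cart, item):
    return cart + [item]


add_item(["pen"], "ink")
provenance = instantiate(TEMPLATE, BINDINGS[0])
```

Here the template is fixed ahead of time, while each call to `add_item` contributes only a small set of bindings; the two are combined afterwards to obtain concrete provenance statements.
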
Original language: English
Journal: COMPUTING
Publication status: Published - 10 Dec 2021

Keywords

  • data provenance
  • application logging
  • technology audits
  • UML
  • PROV
  • UML2PROV
