Abstract
There is a growing awareness among life scientists of the variability in quality of the data in public repositories, and of the threat that poor data quality poses to the validity of experimental results. No standards are available, however, for computing quality levels in this data domain. We argue that data processing environments used by life scientists should feature facilities for expressing and applying quality-based, personal data acceptability criteria. We propose a framework for the specification of users’ quality processing requirements, called quality views. These views are compiled and semi-automatically embedded within the data processing environment. The result is a quality management toolkit that promotes rapid prototyping and reuse of quality components. We illustrate the utility of the framework by showing how it can be deployed within Taverna, a scientific workflow management tool, and applied to actual workflows for data analysis in proteomics.
Original language | English |
---|---|
Title of host publication | VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases |
Publisher | Association for Computing Machinery |
Pages | 977-988 |
Number of pages | 12 |
ISBN (Print) | 1595933859, 9781595933850 |
Publication status | Published - 2006 |
Event | 32nd International Conference on Very Large Data Bases, VLDB 2006 - Seoul, Korea, Republic of. Duration: 12 Sept 2006 → 15 Sept 2006 |
Publication series
Name | VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases |
---|---|
Conference
Conference | 32nd International Conference on Very Large Data Bases, VLDB 2006 |
---|---|
Country/Territory | Korea, Republic of |
City | Seoul |
Period | 12/09/06 → 15/09/06 |
Bibliographical note
Publisher Copyright: © 2006 Association for Computing Machinery. All rights reserved.
ASJC Scopus subject areas
- Hardware and Architecture
- Information Systems
- Software
- Information Systems and Management