Measuring data completeness for microbial genomics database

Nurul A. Emran*, Suzanne Embury, Paolo Missier, Mohd Noor Mat Isa, Azah Kamilah Muda

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Poor-quality data, such as data with missing values (or records), has negative consequences in many application domains. An important aspect of data quality is completeness. One problem in data completeness is that of missing individuals in data sets, where the individuals are the real-world entities whose information is recorded. So far, however, completeness studies have paid little attention to how missing individuals are assessed. In this paper, we propose the notion of population-based completeness (PBC), which deals with the missing-individuals problem. Our aims are to investigate what is required to measure PBC and to identify what is needed to support PBC measurements in practice. The paper explores the need for PBC in microbial genomics, using real sample data sets retrieved from a microbial database called Comprehensive Microbial Resources (CMR).

Original language: English
Title of host publication: Intelligent Information and Database Systems - 5th Asian Conference, ACIIDS 2013, Proceedings
Pages: 186-195
Number of pages: 10
Edition: PART 1
Publication status: Published - 2013
Event: 5th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2013 - Kuala Lumpur, Malaysia
Duration: 18 Mar 2013 to 20 Mar 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7802 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 5th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2013
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 18/03/13 to 20/03/13

Keywords

  • completeness measurement
  • data completeness
  • population-based completeness (PBC)

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
