Tackling virtual and real concept drifts: an adaptive Gaussian mixture model approach

Research output: Contribution to journal, Article, peer-review

External organisations

  • Universidade Federal de Pernambuco

Abstract

Real-world applications have been dealing with large amounts of data that arrive over time and generally present changes in their underlying joint probability distribution, i.e., concept drift. Concept drift can be subdivided into two types: virtual drift, which affects the unconditional probability distribution p(x), and real drift, which affects the conditional probability distribution p(y|x). Existing work focuses on real drift. However, strategies to cope with real drift may not be best suited for dealing with virtual drift, since the real class boundaries remain unchanged. We provide the first in-depth analysis of the differences between the impact of virtual and real drifts on classifiers' suitability. We propose an approach to handle both drifts called On-line Gaussian Mixture Model With Noise Filter For Handling Virtual and Real Concept Drifts (OGMMF-VRD). Experiments with seven synthetic and seven real-world datasets show that OGMMF-VRD outperforms other approaches with separate mechanisms to deal with virtual and real drifts. It also has more stable rankings and smaller drops in performance during drifting periods than existing ensemble approaches, thus being more reliable for adoption in practice.
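
The distinction between the two drift types can be illustrated with a toy example. The sketch below is not OGMMF-VRD and is not taken from the paper. It assumes a one-dimensional feature x drawn from a unit-variance Gaussian and a label defined by a single class boundary, and every name in it (`sample_stream`, `accuracy`, `model_threshold`) is hypothetical. Shifting the mean of x changes p(x) only (virtual drift), while moving the boundary changes p(y|x) only (real drift).

```python
import random

def sample_stream(n, mean, boundary, rng):
    """Draw n (x, y) pairs with x ~ N(mean, 1) and y = 1 if x > boundary else 0."""
    data = []
    for _ in range(n):
        x = rng.gauss(mean, 1.0)
        data.append((x, int(x > boundary)))
    return data

def accuracy(threshold, data):
    """Accuracy of the fixed rule 'predict 1 if x > threshold'."""
    correct = sum(int(x > threshold) == y for x, y in data)
    return correct / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Original concept: inputs centred at 0, class boundary at 0.
original = sample_stream(5000, mean=0.0, boundary=0.0, rng=rng)
# Virtual drift (assumed form): p(x) shifts, the class boundary stays at 0.
virtual = sample_stream(5000, mean=2.0, boundary=0.0, rng=rng)
# Real drift (assumed form): p(x) is unchanged, the class boundary moves to 1.
real = sample_stream(5000, mean=0.0, boundary=1.0, rng=rng)

# A classifier that already matches the original class boundary.
model_threshold = 0.0

acc_original = accuracy(model_threshold, original)
acc_virtual = accuracy(model_threshold, virtual)
acc_real = accuracy(model_threshold, real)

print(f"original: {acc_original:.3f}")
print(f"virtual drift: {acc_virtual:.3f}")
print(f"real drift: {acc_real:.3f}")
```

In this idealised setting the fixed rule stays fully accurate under virtual drift, because the true boundary has not moved, and it misclassifies the points between 0 and 1 under real drift (roughly a third of a standard normal sample). A classifier learned from finite data would typically be less certain in regions it has rarely seen, so virtual drift can still hurt in practice, which is consistent with the abstract's point that the two drift types call for different handling.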

Details

Original language: English
Journal: IEEE Transactions on Knowledge and Data Engineering
Early online date: 29 Jul 2021
Publication status: E-pub ahead of print - 29 Jul 2021

Keywords

  • Data Streams, Virtual Concept Drift, Real Concept Drift, Gaussian Mixture Model