Resample-based ensemble framework for drifting imbalanced data streams

Research output: Contribution to journal (Article)

Authors

  • Hang Zhang
  • Weike Liu
  • Shuo Wang
  • Jicheng Shan
  • Qingbao Liu

Colleges, School and Institutes

External organisations

  • Birmingham City University
  • National University of Defense Technology
  • Lanzhou University

Abstract

Machine learning in real-world scenarios is often challenged by concept drift and class imbalance. This paper proposes a Resample-based Ensemble Framework for Drifting Imbalanced Stream (RE-DI). The ensemble framework consists of a long-term static classifier to handle gradual concept drift and multiple dynamic classifiers to handle sudden concept drift. The weights of the base classifiers are adjusted in two ways. First, a time-decayed strategy decreases the weights of the dynamic classifiers so that the ensemble focuses more on the new concept of the data stream. Second, a novel reinforcement mechanism increases the weights of the base classifiers that perform better on the minority class and decreases the weights of those that perform worse. A resampling buffer stores instances of the minority class to balance the imbalanced distribution over time. In our experiments, we compare the proposed method with other state-of-the-art algorithms on both real-world and synthetic data streams. The results show that the proposed method achieves the best performance in terms of both prequential AUC and accuracy.
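
The three mechanisms named in the abstract (time-decayed weights, minority-class reinforcement, and a minority resampling buffer) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the multiplicative update rules, the `decay` and `step` values, the buffer size, and the names `WeightedMember`, `decay_weights`, `reinforce`, and `MinorityBuffer` are all hypothetical and are not taken from the paper.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class WeightedMember:
    """Hypothetical ensemble member: only its weight and type are modelled."""
    weight: float = 1.0
    dynamic: bool = True  # False stands for the long-term static classifier


def decay_weights(members, decay=0.9):
    """Time decay (assumed multiplicative): shrink dynamic members only."""
    for m in members:
        if m.dynamic:
            m.weight *= decay


def reinforce(members, correct_on_minority, step=0.1):
    """Assumed reinforcement rule: reward members that classified a
    minority-class instance correctly, penalise those that did not."""
    for m, ok in zip(members, correct_on_minority):
        m.weight *= (1 + step) if ok else (1 - step)


class MinorityBuffer:
    """Bounded store of past minority-class instances (size is assumed)."""

    def __init__(self, maxlen=100):
        self.items = deque(maxlen=maxlen)  # oldest instances fall out first

    def add(self, x):
        self.items.append(x)

    def balance(self, chunk, minority_label, rng=random):
        """Return the chunk plus buffered minority instances, adding at most
        enough to equalise the two class counts."""
        n_min = sum(1 for _, y in chunk if y == minority_label)
        deficit = max(0, (len(chunk) - n_min) - n_min)
        k = min(deficit, len(self.items))
        extra = rng.sample(list(self.items), k)
        return list(chunk) + [(x, minority_label) for x in extra]


# Usage: one static and two dynamic members, then a chunk with a 4:1 imbalance.
members = [WeightedMember(dynamic=False), WeightedMember(), WeightedMember()]
decay_weights(members, decay=0.5)
reinforce(members, correct_on_minority=[True, True, False])

buf = MinorityBuffer(maxlen=3)
for x in (10, 11, 12, 13):
    buf.add(x)
chunk = [(0, 0), (1, 0), (2, 0), (3, 0), (9, 1)]
balanced = buf.balance(chunk, minority_label=1)
```

In this sketch the static member is exempt from decay, which mirrors the abstract's statement that the time-decayed strategy applies to the dynamic classifiers; how the paper actually combines decay and reinforcement is not specified in the abstract.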

Details

Original language: English
Pages (from-to): 65103-65115
Number of pages: 13
Journal: IEEE Access
Volume: 7
Publication status: Published - 6 May 2019

Keywords

  • online ensemble learning
  • resample learning
  • reinforcement
  • concept drift
  • class imbalance