Ensemble Learning for Data Stream Analysis: a survey

B. KRAWCZYK, L.L. MINKU, J. GAMA, J. STEFANOWSKI, M. WOZNIAK

Research output: Contribution to journal, Article, peer-review

406 Citations (Scopus)

Abstract

In many applications of information systems, learning algorithms have to act in dynamic environments where data are collected in the form of transient data streams. Compared to static data mining, processing streams imposes new computational requirements: algorithms must process incoming examples incrementally while using limited memory and time. Furthermore, due to the non-stationary characteristics of streaming data, prediction models are often also required to adapt to concept drifts. Among the many newly proposed stream algorithms, ensembles play an important role, in particular for non-stationary environments. This paper surveys research on ensembles for data stream classification as well as regression tasks. Besides presenting a comprehensive spectrum of ensemble approaches for data streams, we also discuss advanced learning concepts such as imbalanced data streams, novelty detection, active and semi-supervised learning, complex data representations and structured outputs. The paper concludes with a discussion of open research problems and lines of future research.
Original language: English
Pages (from-to): 132-156
Number of pages: 25
Journal: Information Fusion
Volume: 37
Early online date: 3 Feb 2017
Publication status: Published - Sept 2017
