Multiclass imbalance problems: Analysis and potential solutions

Shuo Wang*, Xin Yao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

431 Citations (Scopus)

Abstract

Class imbalance problems have drawn growing interest recently because of their classification difficulty caused by the imbalanced class distributions. In particular, many ensemble methods have been proposed to deal with such imbalance. However, most efforts so far are only focused on two-class imbalance problems. There are unsolved issues in multiclass imbalance problems, which exist in real-world applications. This paper studies the challenges posed by the multiclass imbalance problems and investigates the generalization ability of some ensemble solutions, including our recently proposed algorithm AdaBoost.NC, with the aim of handling multiclass and imbalance effectively and directly. We first study the impact of multiminority and multimajority on the performance of two basic resampling techniques. They both present strong negative effects. "Multimajority" tends to be more harmful to the generalization performance. Motivated by the results, we then apply AdaBoost.NC to several real-world multiclass imbalance tasks and compare it to other popular ensemble methods. AdaBoost.NC is shown to be better at recognizing minority class examples and balancing the performance among classes in terms of G-mean without using any class decomposition.
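
As an illustration of the G-mean criterion mentioned in the abstract, the following minimal sketch computes a multiclass G-mean as the geometric mean of per-class recall. It assumes this common multiclass extension of G-mean; the function name `g_mean` and the toy labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed multiclass G-mean)."""
    totals = Counter(y_true)  # number of examples per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    recalls = [hits[c] / totals[c] for c in totals]
    return math.prod(recalls) ** (1.0 / len(recalls))


# Toy data (hypothetical): one majority class "a" and two minority classes.
y_true = ["a"] * 8 + ["b"] * 2 + ["c"] * 2

# A classifier that always predicts the majority class.
y_majority = ["a"] * 12

# A classifier that gives up some majority accuracy to recover the minorities.
y_balanced = ["a"] * 6 + ["b", "c"] + ["b", "b"] + ["c", "a"]

print(g_mean(y_true, y_majority))
print(g_mean(y_true, y_balanced))
```

Because the recalls are multiplied, a single class with zero recall drives the score to zero. This is why the measure rewards balanced performance across classes rather than overall accuracy, which a majority-only classifier can still score well on.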

Original language: English
Article number: 6170916
Pages (from-to): 1119-1130
Number of pages: 12
Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Volume: 42
Issue number: 4
Publication status: Published - 2012

Bibliographical note

Funding Information:
Manuscript received March 14, 2011; revised October 12, 2011; accepted January 4, 2012. Date of publication March 16, 2012; date of current version July 13, 2012. This work was supported by an ORS Award, an EPSRC Grant (No. EP/D052785/1), and a European FP7 Grant (No. 270428). This paper was recommended by Associate Editor N. Chawla.

Keywords

  • Boosting
  • diversity
  • ensemble learning
  • multiclass imbalance problems
  • negative correlation learning

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering
