Abstract
Class imbalance is a pervasive challenge in real-world machine learning (ML) applications, where the minority class, often the class of interest, is significantly underrepresented. This imbalance can degrade model performance, result in misleading evaluation metrics, and complicate validation processes. Two prominent data-augmentation techniques to address class imbalance are the Synthetic Minority Oversampling Technique (SMOTE) and Generative Adversarial Networks (GANs). However, both techniques have inherent limitations, motivating the emergence of novel variants designed to overcome these challenges. While previous reviews have typically focused on specific domains, conventional methodologies, or broad strategy overviews, this review presents a unified taxonomy that outlines the causes, types, and implications of class imbalance across diverse ML tasks. It further examines emerging trends in the application of SMOTE and GAN techniques, their limitations, and hybrid adaptations. By categorising imbalance types and analysing models, metrics, datasets, and comparative approaches, this review provides actionable insights and identifies future research directions for practitioners and researchers working to address class imbalance in real-world ML tasks.
| Original language | English |
|---|---|
| Pages (from-to) | 113838-113853 |
| Number of pages | 16 |
| Journal | IEEE Access |
| Volume | 13 |
| Early online date | 1 Jul 2025 |
| DOIs | |
| Publication status | Published - 9 Jul 2025 |
Keywords
- Class imbalance
- data-augmentation
- generative adversarial networks
- machine learning
- SMOTE
ASJC Scopus subject areas
- General Computer Science
- General Materials Science
- General Engineering
Title
Emerging SMOTE and GAN Variants for Data Augmentation in Imbalance Machine Learning Tasks: A Review