
Emerging SMOTE and GAN Variants for Data Augmentation in Imbalance Machine Learning Tasks: A Review

  • Amadi G. Udu*
  • Marwah T. Salman
  • Maryam K. Ghalati
  • Andrea Lecchini-Visintini
  • David R. Siddle
  • Hongbiao Dong

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

Abstract

Class imbalance is a pervasive challenge in real-world machine learning (ML) applications, where the minority class, often the class of interest, is significantly underrepresented. This imbalance can degrade model performance, result in misleading evaluation metrics, and complicate validation processes. Two prominent data-augmentation techniques to address class imbalance are the Synthetic Minority Oversampling Technique (SMOTE) and Generative Adversarial Networks (GAN). However, both techniques have inherent limitations, motivating the emergence of novel variants designed to overcome these challenges. While previous reviews have typically focused on specific domains, conventional methodologies, or broad strategy overviews, this review presents a unified taxonomy that outlines the causes, types, and implications of class imbalance across diverse ML tasks. It further examines emerging trends in the application of SMOTE and GAN techniques, their limitations, and hybrid adaptations. By categorising imbalance types and analysing models, metrics, datasets, and comparative approaches, this review provides actionable insights and identifies future research directions for practitioners and researchers working to address class imbalance in real-world ML tasks.
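
The abstract names SMOTE without describing its mechanics. As a rough illustration only, the following minimal sketch assumes the commonly described formulation: a synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its k nearest minority-class neighbours. The function name `smote_sketch`, its parameters, and the toy data are hypothetical and are not taken from the review.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points built by SMOTE-style interpolation.

    Hypothetical helper for illustration; not the review's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy two-dimensional minority class (assumed values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
```

Because each synthetic point is a convex combination of two existing minority samples, it always lies within the region spanned by the minority class. This may help explain why such interpolation can struggle when that region overlaps the majority class.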

Original language: English
Pages (from-to): 113838-113853
Number of pages: 16
Journal: IEEE Access
Volume: 13
Early online date: 1 Jul 2025
Publication status: Published - 9 Jul 2025

Keywords

  • Class imbalance
  • data-augmentation
  • generative adversarial networks
  • machine learning
  • SMOTE

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering
