Triplets Oversampling for Class Imbalanced Federated Datasets

Chenguang Xiao, Shuo Wang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Class imbalance is a pervasive problem in machine learning, leading to poor performance on the inadequately represented minority class. Federated learning, which trains a shared model collaboratively among multiple clients while keeping their data local for privacy protection, is also susceptible to class imbalance. The distributed structure and privacy rules in federated learning add extra complexity to the challenge of isolated, small, and highly skewed datasets. While sampling and ensemble learning are state-of-the-art techniques for mitigating class imbalance from the data and algorithm perspectives, they face limitations in the context of federated learning. To address this challenge, we propose a novel oversampling algorithm called "Triplets" that generates synthetic samples for both minority and majority classes based on their shared classification boundary. The proposed algorithm captures new minority samples by leveraging a triplet of three samples around the boundary, two from the majority class and one from the minority class. This approach offers several advantages over existing oversampling techniques on federated datasets. We evaluate the proposed algorithm through extensive experiments using various real-world datasets and different models in both centralized and federated learning environments. The results show that the proposed algorithm outperforms existing oversampling techniques. In conclusion, our proposed algorithm offers a promising solution to the class imbalance problem in federated learning. The source code is released at github.com/Xiao-Chenguang/Triplets-Oversampling.
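The abstract describes the triplet idea only at a high level, so the following is a minimal sketch under stated assumptions and not the authors' implementation (the released repository is the reference). It assumes one possible reading: a minority sample is paired with a nearby majority sample and with one of that sample's majority neighbours, and the minority sample is shifted along the direction between the two majority samples. The function names, the parameter `k`, and the update rule `x + lam * (m2 - m1)` are all hypothetical.

```python
import math
import random


def nearest(point, candidates, k):
    """Return the k candidates closest to point (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def triplet_oversample(minority, majority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority points from (x, m1, m2) triplets.

    ASSUMPTION: this update rule is an illustrative guess based on the
    abstract, not the algorithm published by the authors.
    Requires at least two majority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # One minority sample forms the base of the triplet.
        x = rng.choice(minority)
        # First majority sample: one of the k majority points nearest to x.
        m1 = rng.choice(nearest(x, majority, k))
        # Second majority sample: a near majority neighbour of m1 (not m1 itself).
        others = [m for m in majority if m is not m1]
        m2 = rng.choice(nearest(m1, others, k))
        # Shift x by a random fraction of the majority-side direction m1 -> m2.
        lam = rng.random()
        synthetic.append(
            tuple(xi + lam * (b - a) for xi, a, b in zip(x, m1, m2))
        )
    return synthetic


if __name__ == "__main__":
    minority = [(0.0, 0.0), (0.1, 0.2)]
    majority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (2.0, 2.0)]
    for point in triplet_oversample(minority, majority, n_new=3):
        print(point)
```

In a federated setting, a routine like this would run independently on each client's local data, so no raw samples would need to leave the client; how the published method handles that step is not stated in the abstract.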
Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases: Research Track
Subtitle of host publication: European Conference, ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Proceedings, Part II
Editors: Danai Koutra, Claudia Plant, Manuel Gomez Rodriguez, Elena Baralis, Francesco Bonchi
Publisher: Springer
Pages: 368–383
Number of pages: 16
Edition: 1
ISBN (Electronic): 9783031434150
ISBN (Print): 9783031434143
Publication status: Published - 17 Sept 2023
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - Turin, Italy
Duration: 18 Sept 2023 to 22 Sept 2023

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 14170
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
Abbreviated title: ECML PKDD 2023
Country/Territory: Italy
City: Turin
Period: 18/09/23 to 22/09/23
