Abstract
Class imbalance is a pervasive problem in machine learning, leading to poor performance on the inadequately represented minority class. Federated learning, which trains a shared model collaboratively among multiple clients while keeping their data local for privacy protection, is also susceptible to class imbalance. The distributed structure and privacy rules in federated learning add extra complexity to the challenge of isolated, small, and highly skewed datasets. While sampling and ensemble learning are state-of-the-art techniques for mitigating class imbalance from the data and algorithm perspectives, they face limitations in the context of federated learning. To address this challenge, we propose a novel oversampling algorithm called "Triplets" that generates synthetic samples for both minority and majority classes based on their shared classification boundary. The proposed algorithm captures new minority samples by leveraging three triplets around the boundary, where two come from the majority class and one from the minority class. This approach offers several advantages over existing oversampling techniques on federated datasets. We evaluate the proposed algorithm through extensive experiments using various real-world datasets and different models in both centralized and federated learning environments. Our results show that it outperforms existing oversampling techniques, offering a promising solution to the class imbalance problem in federated learning. The source code is released at github.com/Xiao-Chenguang/Triplets-Oversampling.
Original language | English |
---|---|
Title of host publication | Machine Learning and Knowledge Discovery in Databases: Research Track |
Subtitle of host publication | European Conference, ECML PKDD 2023, Turin, Italy, September 18–22, 2023, Proceedings, Part II |
Editors | Danai Koutra, Claudia Plant, Manuel Gomez Rodriguez, Elena Baralis, Francesco Bonchi |
Publisher | Springer |
Pages | 368–383 |
Number of pages | 16 |
Edition | 1 |
ISBN (Electronic) | 9783031434150 |
ISBN (Print) | 9783031434143 |
Publication status | Published - 17 Sept 2023 |
Event | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - Turin, Italy; Duration: 18 Sept 2023 → 22 Sept 2023 |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 14170 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases |
---|---|
Abbreviated title | ECML PKDD 2023 |
Country/Territory | Italy |
City | Turin |
Period | 18/09/23 → 22/09/23 |
Fingerprint
Dive into the research topics of 'Triplets Oversampling for Class Imbalanced Federated Datasets'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Detecting and Anticipating Data Non-stationarity in Distributed Machine Learning
  1/10/22 → 30/09/23
  Project: Research