Improved automated CASH optimization with Tree Parzen Estimators for class imbalance problems

Duc Anh Nguyen, Jiawen Kong, Hao Wang, Stefan Menzel, Bernhard Sendhoff, Anna V. Kononova, Thomas Bäck

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The imbalanced classification problem is highly relevant in both academic and industrial applications. Finding the best machine learning model for a specific imbalanced dataset is complicated by the large number of existing algorithms, each with its own hyperparameters. The Combined Algorithm Selection and Hyperparameter optimization (CASH) problem has been introduced to tackle both aspects at the same time. However, CASH has not been studied in detail in the class imbalance domain, where the search is for the best combination of resampling technique and classification algorithm together with their optimized hyperparameters. We therefore target the CASH problem for imbalanced classification. We experiment with a search space of 5 classification algorithms, 21 resampling approaches and 64 relevant hyperparameters in total. Moreover, we investigate the performance of two well-known optimization approaches: random search and the Tree Parzen Estimators approach, which is a kind of Bayesian optimization. For comparison, we also perform grid search on all combinations of resampling techniques and classification algorithms with their default hyperparameters. Our experimental results show that the Bayesian optimization approach outperforms the other approaches for CASH in this application domain.
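The structure of the CASH search described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the search space is a tiny hypothetical stand-in (not the 5 classifiers, 21 resamplers and 64 hyperparameters of the paper), the parameter names and ranges are invented, and `placeholder_objective` replaces what would in practice be a cross-validated score on an imbalanced dataset. Only the random search baseline is shown; a Tree Parzen Estimators optimizer would replace the uniform sampling step with model-guided proposals.

```python
import math
import random

# Hypothetical, heavily reduced CASH search space. Each component (resampler,
# classifier) has several algorithms, and each algorithm has its own
# hyperparameters, so the space is conditional on the algorithm choice.
SEARCH_SPACE = {
    "resampler": {
        "none": {},
        "smote": {"k_neighbors": ("int", 1, 10)},
        "random_under": {"replacement": ("choice", [True, False])},
    },
    "classifier": {
        "svm": {"C": ("loguniform", 1e-3, 1e3)},
        "random_forest": {"n_estimators": ("int", 10, 500)},
        "knn": {"n_neighbors": ("int", 1, 30)},
    },
}


def sample_param(spec, rng):
    """Draw one hyperparameter value from its (kind, ...) specification."""
    kind = spec[0]
    if kind == "int":
        return rng.randint(spec[1], spec[2])  # inclusive on both ends
    if kind == "loguniform":
        return math.exp(rng.uniform(math.log(spec[1]), math.log(spec[2])))
    if kind == "choice":
        return rng.choice(spec[1])
    raise ValueError(f"unknown parameter kind: {kind}")


def sample_configuration(rng):
    """Pick one algorithm per component, then only that algorithm's parameters."""
    config = {}
    for component, algorithms in SEARCH_SPACE.items():
        name = rng.choice(sorted(algorithms))
        params = {p: sample_param(s, rng) for p, s in algorithms[name].items()}
        config[component] = (name, params)
    return config


def random_search(objective, n_trials, seed=0):
    """Evaluate n_trials uniformly sampled configurations and keep the best."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = sample_configuration(rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def placeholder_objective(config):
    """Stand-in score with made-up numbers; NOT a result from the paper."""
    base = {"svm": 0.6, "random_forest": 0.7, "knn": 0.5}[config["classifier"][0]]
    bonus = 0.0 if config["resampler"][0] == "none" else 0.1
    return base + bonus


best_config, best_score = random_search(placeholder_objective, n_trials=50)
print(best_config, best_score)
```

The point of the sketch is the conditional structure: hyperparameters are only sampled for the resampler and classifier that were actually selected, which is what distinguishes CASH from plain hyperparameter tuning of a fixed model.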

Original language: English
Title of host publication: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)
Publisher: IEEE
Pages: 1-9
Number of pages: 9
ISBN (Electronic): 9781665420990
ISBN (Print): 9781665421003
DOIs
Publication status: Published - 20 Oct 2021
Externally published: Yes
Event: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA) - Porto, Portugal
Duration: 6 Oct 2021 to 9 Oct 2021

Publication series

Name: International Conference on Data Science and Advanced Analytics (DSAA)
Publisher: IEEE
ISSN (Print): 2472-1573
ISSN (Electronic): 2766-4112

Conference

Conference: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)
Period: 6/10/21 to 9/10/21

Bibliographical note

This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement number 766186 (ECOLE).

Keywords

  • Machine learning algorithms
  • Conferences
  • Machine learning
  • Data science
  • Classification algorithms
  • Bayes methods
  • Task analysis
