Stability and generalization of stochastic gradient methods for minimax problems

Yunwen Lei, Zhenhuan Yang, Tianbao Yang, Yiming Ying

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution


Abstract

Many machine learning problems can be formulated as minimax problems, including Generative Adversarial Networks (GANs), AUC maximization and robust estimation, to mention but a few. A substantial number of studies have been devoted to the convergence behavior of their stochastic gradient-type algorithms. In contrast, there is relatively little work on their generalization, i.e., how the learning models built from training examples would behave on test examples. In this paper, we provide a comprehensive generalization analysis of stochastic gradient methods for minimax problems in both convex-concave and nonconvex-nonconcave settings through the lens of algorithmic stability. We establish a quantitative connection between stability and several generalization measures, both in expectation and with high probability. For the convex-concave setting, our stability analysis shows that stochastic gradient descent ascent attains optimal generalization bounds for both smooth and nonsmooth minimax problems. We also establish generalization bounds for both weakly-convex-weakly-concave and gradient-dominated problems.
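For readers unfamiliar with the algorithm the abstract refers to, the following is a minimal sketch of stochastic gradient descent ascent (SGDA) on a toy strongly-convex-strongly-concave objective. The objective f(x, y; z) = x²/2 + xy - y²/2 - zy, the data distribution, and the step size are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.5, scale=1.0, size=1000)  # toy training sample z_i

x, y = 0.0, 0.0  # x is the min player, y is the max player
eta = 0.01       # constant step size (an assumption for this sketch)

for t in range(5000):
    z = rng.choice(data)   # draw one training example uniformly at random
    gx = x + y             # stochastic gradient df/dx of f(x, y; z)
    gy = x - y - z         # stochastic gradient df/dy of f(x, y; z)
    x -= eta * gx          # descent step on the min variable
    y += eta * gy          # ascent step on the max variable
```

For this toy objective the population saddle point is x = E[z]/2, y = -E[z]/2, so with E[z] = 0.5 the iterates settle near (0.25, -0.25); the paper's stability analysis concerns how such iterates, trained on one sample, generalize to fresh examples.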
Original language: English
Title of host publication: Proceedings of ICML 2021
Editors: Marina Meila, Tong Zhang
Publisher: JMLR
Pages: 6175-6186
Number of pages: 12
Publication status: Published - 18 Jul 2021
Event: The Thirty-eighth International Conference on Machine Learning - Virtual
Duration: 18 Jul 2021 - 24 Jul 2021
https://icml.cc/

Publication series

Name: Proceedings of Machine Learning Research
Volume: 139
ISSN (Electronic): 2640-3498

Conference

Conference: The Thirty-eighth International Conference on Machine Learning
Abbreviated title: ICML 2021
Period: 18/07/21 - 24/07/21
