Optimisation and Learning with Randomly Compressed Gradient Updates

Zhanliang Huang*, Yunwen Lei, Ata Kaban

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

223 Downloads (Pure)

Abstract

Gradient descent methods are simple and efficient optimisation algorithms with widespread applications. To handle high-dimensional problems, we study compressed stochastic gradient descent (SGD) with low-dimensional gradient updates. We provide a detailed analysis in terms of both optimisation rates and generalisation rates. To this end, we develop uniform stability bounds for CompSGD for both smooth and non-smooth problems, based on which we develop almost optimal population risk bounds. Then, we extend our analysis to two variants of SGD – batch and mini-batch gradient descent. Furthermore, we show these variants achieve almost optimal rates compared to their high-dimensional gradient setting. Thus, our results provide a way to reduce the dimension of gradient updates without affecting the convergence rate in the generalisation analysis. Moreover, we show that the same result also holds in the differentially private setting, which allows us to reduce the dimension of added noise at “almost free” cost.
Original languageEnglish
Number of pages53
JournalNeural Computation
Early online date15 May 2023
DOIs
Publication statusE-pub ahead of print - 15 May 2023

Keywords

  • Gradient descent
  • Random projection
  • Generalisation bounds
  • Differential privacy

Fingerprint

Dive into the research topics of 'Optimisation and Learning with Randomly Compressed Gradient Updates'. Together they form a unique fingerprint.

Cite this