Structure discovery in PAC-Learning by Random Projections

Ata Kaban*, Henry Reeve

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

High dimensional learning is data-hungry in general; however, many natural data sources and real-world learning problems possess some hidden low-complexity structure that permits effective learning from relatively small sample sizes. We are interested in the general question of how to discover and exploit such hidden benign traits when problem-specific prior knowledge is insufficient. In this work, we address this question through random projection’s ability to expose structure. We study both compressive learning and high dimensional learning from this angle by introducing the notions of compressive distortion and compressive complexity. We give user-friendly PAC bounds in the agnostic setting that are formulated in terms of these quantities, and we show that our bounds can be tight when these quantities are small. We then instantiate these quantities in several examples of particular learning problems, demonstrating their ability to discover interpretable structural characteristics that make high dimensional instances of these problems solvable to good approximation in a random linear subspace. In the examples considered, these turn out to resemble some familiar benign traits such as the margin, the margin distribution, the intrinsic dimension, the spectral decay of the data covariance, or the norms of parameters, while our general notions of compressive distortion and compressive complexity serve to unify these, and may be used to discover benign structural traits for other PAC-learnable problems.
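The core phenomenon the abstract appeals to, namely that a random linear subspace can retain the geometry of data with hidden low-complexity structure, can be illustrated with a minimal sketch (not taken from the paper; all names, dimensions, and the Gaussian projection choice are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_projection(X, k, rng):
    # Gaussian random projection, scaled so squared norms are
    # preserved in expectation (illustrative choice of projection).
    d = X.shape[1]
    R = rng.normal(size=(d, k)) / np.sqrt(k)
    return X @ R

# Synthetic data with hidden low-complexity structure:
# n points lying on a 5-dimensional subspace of a d = 1000 ambient space.
n, d, intrinsic = 30, 1000, 5
basis = rng.normal(size=(intrinsic, d))
X = rng.normal(size=(n, intrinsic)) @ basis

# Project into k = 50 dimensions -- far below d, above the intrinsic dimension.
k = 50
Xp = random_projection(X, k, rng)

# Pairwise distances before and after projection.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
Dp = np.linalg.norm(Xp[:, None, :] - Xp[None, :, :], axis=2)

# Off-diagonal distance ratios stay near 1: the geometry of the
# low-complexity data survives in the random linear subspace.
mask = ~np.eye(n, dtype=bool)
ratios = Dp[mask] / D[mask]
print(ratios.min(), ratios.max())
```

With a fixed seed the printed ratios concentrate around 1, which is the distortion-control behaviour that the paper's notions of compressive distortion and compressive complexity quantify in far greater generality.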
Original language: English
Journal: Machine Learning
Early online date: 26 Mar 2024
DOIs
Publication status: E-pub ahead of print, 26 Mar 2024

Bibliographical note

Funding:
This work was funded by EPSRC Fellowship EP/P004245/1.

Keywords

  • Random Projection
  • Generalization
  • Structure Discovery
