TY - GEN

T1 - Tighter guarantees for the compressive multi-layer perceptron

AU - Kaban, Ata

AU - Thummanusarn, Yamonporn

PY - 2018/11/22

Y1 - 2018/11/22

N2 - We are interested in theoretical guarantees for classic 2-layer feed-forward neural networks with sigmoidal activation functions, having inputs linearly compressed by random projection. Due to the rapid increase of the dimensionality of modern data sets, and the development of novel data acquisition devices in compressed sensing, a proper understanding of the obtainable guarantees is of much practical importance. We start by analysing previous work that attempted to derive a lower bound on the target dimension to ensure low distortion of the outputs under random projection, and we find a disagreement with empirically observed behaviour. We then give a new lower bound on the target dimension that, in contrast with previous work, does not depend on the number of hidden neurons, but only on the Frobenius norm of the first-layer weights; in addition, it holds for a much larger class of random projections. Numerical experiments agree with our finding. Furthermore, we are able to bound the generalisation error of the compressive network in terms of the error and the expected distortion of the optimal network in the original uncompressed class. These results mean that one can provably learn networks with an arbitrarily large number of hidden units from randomly compressed data, as long as there is sufficient regularity in the original learning problem, which our analysis rigorously quantifies.

AB - We are interested in theoretical guarantees for classic 2-layer feed-forward neural networks with sigmoidal activation functions, having inputs linearly compressed by random projection. Due to the rapid increase of the dimensionality of modern data sets, and the development of novel data acquisition devices in compressed sensing, a proper understanding of the obtainable guarantees is of much practical importance. We start by analysing previous work that attempted to derive a lower bound on the target dimension to ensure low distortion of the outputs under random projection, and we find a disagreement with empirically observed behaviour. We then give a new lower bound on the target dimension that, in contrast with previous work, does not depend on the number of hidden neurons, but only on the Frobenius norm of the first-layer weights; in addition, it holds for a much larger class of random projections. Numerical experiments agree with our finding. Furthermore, we are able to bound the generalisation error of the compressive network in terms of the error and the expected distortion of the optimal network in the original uncompressed class. These results mean that one can provably learn networks with an arbitrarily large number of hidden units from randomly compressed data, as long as there is sufficient regularity in the original learning problem, which our analysis rigorously quantifies.

KW - Error analysis

KW - Random projection

KW - Multi-layer perceptron

U2 - 10.1007/978-3-030-04070-3_30

DO - 10.1007/978-3-030-04070-3_30

M3 - Conference contribution

SN - 978-3-030-04069-7

T3 - Lecture Notes in Computer Science

SP - 388

EP - 400

BT - Theory and Practice of Natural Computing

A2 - Fagan, David

A2 - Martín-Vide, Carlos

A2 - O’Neill, Michael

A2 - Vega-Rodríguez, Miguel A.

PB - Springer

T2 - 7th International Conference on the Theory and Practice of Natural Computing (TPNC 2018)

Y2 - 12 December 2018 through 14 December 2018

ER -