Improved bounds on the dot product under random projection and random sign projection

Research output: Chapter in Book/Report/Conference proceedingConference contribution

12 Citations (Scopus)

Abstract

Dot product is a key building block in a number of data mining algorithms from classification, regression, correlation clustering, to information retrieval and many others. When data is high dimensional, the use of random projections may serve as a universal dimensionality reduction method that provides both low distortion guarantees and computational savings. Yet, contrary to the optimal guarantees that are known on the preservation of the Euclidean distance cf. the Johnson-Lindenstrauss lemma, the existing guarantees on the dot product under random projection are loose and incomplete in the current data mining and machine learning literature. Some recent literature even suggested that the dot product may not be preserved when the angle between the original vectors is obtuse.

In this paper we provide improved bounds on the dot product under random projection that matches the optimal bounds on the Euclidean distance. As a corollary, we elucidate the impact of the angle between the original vectors on the relative distortion of the dot product under random projection, and we show that the obtuse vs. acute angles behave symmetrically in the same way. In a further corollary we make a link to sign random projection, where we generalise earlier results. Numerical simulations confirm our theoretical results. Finally we give an application of our results to bounding the generalisation error of compressive linear classifiers under the margin loss.
Original languageEnglish
Title of host publicationKDD '15 Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Place of PublicationNew York
PublisherAssociation for Computing Machinery (ACM)
Pages487-496
Number of pages10
ISBN (Print)978-1-4503-3664-2
DOIs
Publication statusPublished - 10 Aug 2015
EventProceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Sydney, Australia, United Kingdom
Duration: 10 Aug 201513 Aug 2015

Conference

ConferenceProceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Country/TerritoryUnited Kingdom
CitySydney, Australia
Period10/08/1513/08/15

Fingerprint

Dive into the research topics of 'Improved bounds on the dot product under random projection and random sign projection'. Together they form a unique fingerprint.

Cite this