Locality-Sensitive Hashing for f-Divergences: Mutual Information Loss and Beyond

10/28/2019
by Lin Chen et al.

Computing approximate nearest neighbors in high-dimensional spaces is a central problem in large-scale data mining, with a wide range of applications in machine learning and data science. A popular and effective technique for approximate nearest-neighbor search is the locality-sensitive hashing (LSH) scheme. In this paper, we develop LSH schemes for distance functions that measure the distance between two probability distributions, in particular for f-divergences and for a generalization that captures mutual information loss. First, we provide a general framework for designing LSH schemes for f-divergence distance functions and use it to develop LSH schemes for the generalized Jensen-Shannon divergence and triangular discrimination. We prove a two-sided approximation of the generalized Jensen-Shannon divergence by the Hellinger distance, which may be of independent interest. Next, we give a general method for reducing the problem of designing an LSH scheme for a Krein kernel (a kernel expressible as the difference of two positive definite kernels) to the problem of maximum inner product search. We illustrate this method by applying it to the mutual information loss, which has several important applications such as model compression.
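For concreteness, here is a minimal Python sketch of the embedding idea behind the first part of the abstract: since the Hellinger distance satisfies H(p, q) = (1/√2)·||√p − √q||₂, any Euclidean LSH family (here the p-stable scheme of Datar et al.) applied to the square-root embedding of a distribution is locality-sensitive for the Hellinger distance, and a two-sided approximation of the kind proved in the paper transfers this to the generalized Jensen-Shannon divergence. The class name, parameters, and bucket width below are illustrative choices, not the paper's exact construction.

```python
import numpy as np

class HellingerLSH:
    """Euclidean (p-stable) LSH applied to the sqrt-embedding of a
    probability distribution.  Because
        H(p, q) = (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2,
    a Euclidean LSH family on sqrt(p) is an LSH family for the Hellinger
    distance, and hence an approximate scheme for divergences that the
    Hellinger distance approximates within constant factors."""

    def __init__(self, dim, n_hashes=16, bucket_width=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Random Gaussian projections and uniform offsets (Datar et al., 2004).
        self.A = rng.normal(size=(n_hashes, dim))
        self.b = rng.uniform(0.0, bucket_width, size=n_hashes)
        self.r = bucket_width

    def hash(self, p):
        p = np.asarray(p, dtype=float)
        assert np.all(p >= 0) and abs(p.sum() - 1.0) < 1e-8, "expects a distribution"
        v = np.sqrt(p)  # the square-root embedding
        return tuple(np.floor((self.A @ v + self.b) / self.r).astype(int))

# Nearby distributions collide in many coordinates; distant ones rarely do.
lsh = HellingerLSH(dim=4)
print(lsh.hash([0.25, 0.25, 0.25, 0.25]))
print(lsh.hash([0.24, 0.26, 0.25, 0.25]))
```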
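The Krein-kernel reduction mentioned in the abstract can likewise be sketched under a simplifying assumption: if explicit finite-dimensional feature maps φ⁺ and φ⁻ are available for the two positive definite parts, then concatenating [φ⁺(x), φ⁻(x)] on the query side and [φ⁺(y), −φ⁻(y)] on the data side makes the ordinary inner product of the lifted vectors equal to K(x, y), turning kernel maximization into a standard MIPS instance. The helper names and toy kernel here are hypothetical; the paper's reduction applies this idea to the mutual information loss.

```python
import numpy as np

def krein_to_mips(phi_plus, phi_minus, queries, data):
    """Reduce similarity search under a Krein kernel
    K(x, y) = K+(x, y) - K-(x, y) to maximum inner product search (MIPS),
    assuming explicit feature maps phi_plus / phi_minus for the two
    positive definite parts.  The lifted inner product satisfies
    <[phi+(x), phi-(x)], [phi+(y), -phi-(y)]> = K+(x, y) - K-(x, y)."""
    Q = np.hstack([phi_plus(queries), phi_minus(queries)])
    D = np.hstack([phi_plus(data), -phi_minus(data)])
    # Brute-force MIPS for illustration; in practice D would be fed to
    # any MIPS index, e.g. an asymmetric-LSH scheme.
    scores = Q @ D.T  # scores[i, j] = K(queries[i], data[j])
    return scores.argmax(axis=1)

# Toy indefinite kernel K(x, y) = <x, y> - <x^2, y^2>: a difference of
# two positive definite (linear) kernels, i.e. a Krein kernel.
rng = np.random.default_rng(1)
data, queries = rng.random((100, 5)), rng.random((3, 5))
print(krein_to_mips(lambda X: X, lambda X: X**2, queries, data))
```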

READ FULL TEXT

page 1

page 2

page 3

page 4

Related research

02/10/2020 - Locality-sensitive hashing in function spaces
We discuss the problem of performing similarity search over function spa...

01/14/2020 - Robust Generalization via α-Mutual Information
The aim of this work is to provide bounds connecting two probability mea...

02/02/2022 - Investigation of Alternative Measures for Mutual Information
Mutual information I(X;Y) is a useful definition in information theory t...

01/12/2021 - Locality Sensitive Hashing for Efficient Similar Polygon Retrieval
Locality Sensitive Hashing (LSH) is an effective method of indexing a se...

09/13/2022 - Rényi Divergence Deep Mutual Learning
This paper revisits an incredibly simple yet exceedingly effective compu...

04/05/2022 - Practical Bounds of Kullback-Leibler Divergence Using Maximum Mean Discrepancy
Estimating Kullback Leibler (KL) divergence from data samples is a stren...

01/28/2019 - The CM Algorithm for the Maximum Mutual Information Classifications of Unseen Instances
The Maximum Mutual Information (MMI) criterion is different from the Lea...
