Provable benefits of score matching

by Chirag Pabbaraju, et al.

Score matching is an alternative to maximum likelihood (ML) for estimating a probability distribution parametrized up to a constant of proportionality. By fitting the "score" of the distribution, it sidesteps the need to compute this constant of proportionality (which is often intractable). While score matching and its variants are popular in practice, their benefits and tradeoffs relative to maximum likelihood, both computational and statistical, are not well understood theoretically. In this work, we give the first example of a natural exponential family of distributions such that the score matching loss is computationally efficient to optimize and has statistical efficiency comparable to ML, while the ML loss is intractable to optimize using a gradient-based method. The family consists of exponentials of polynomials of fixed degree, and our result can be viewed as a continuous analogue of recent developments in the discrete setting. Precisely, we show: (1) Designing a zeroth-order or first-order oracle for optimizing the maximum likelihood loss is NP-hard. (2) Maximum likelihood has a statistical efficiency polynomial in the ambient dimension and the radius of the parameters of the family. (3) Minimizing the score matching loss is both computationally and statistically efficient, with complexity polynomial in the ambient dimension.
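To illustrate why the score matching loss can be easy to optimize for this kind of family, here is a minimal sketch, not the paper's algorithm. It assumes a one-dimensional model p(x) ∝ exp(θ_1 x + ... + θ_d x^d) and uses the standard Hyvärinen form of the loss, E[ (1/2) (d/dx log p(x))^2 + d^2/dx^2 log p(x) ]. For an exponential family this loss is quadratic in θ, so the empirical minimizer solves a linear system and the normalizing constant never appears. The function name `score_matching_fit` and the Gaussian test data are hypothetical choices made for illustration.

```python
import numpy as np

def score_matching_fit(x, degree):
    """Closed-form score matching for p(x) ∝ exp(sum_k theta_k * x**k), k = 1..degree.

    Illustrative 1-D sketch: the sufficient statistics are T_k(x) = x**k.
    """
    x = np.asarray(x, dtype=float)
    k = np.arange(1, degree + 1)
    # First derivatives of the sufficient statistics: k * x**(k-1), shape (n, degree).
    grad = k * x[:, None] ** (k - 1)
    # Second derivatives: k * (k-1) * x**(k-2); the k = 1 column is identically zero.
    lap = k * (k - 1) * x[:, None] ** np.maximum(k - 2, 0)
    # Empirical loss is 0.5 * theta^T A theta + b^T theta.
    A = grad.T @ grad / len(x)
    b = lap.mean(axis=0)
    # Setting the gradient A theta + b to zero gives the minimizer.
    return np.linalg.solve(A, -b)

# Usage: samples from a Gaussian, which lies in the degree-2 family.
rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=10_000)
theta = score_matching_fit(x, degree=2)
```

For degree 2 the solution reduces to θ = (m / v, -1 / (2v)), where m and v are the sample mean and (biased) sample variance, which matches the natural parameters of a Gaussian. Higher degrees use the same linear solve, though the resulting density is only normalizable when the leading coefficient is negative and the degree is even.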



Statistical Efficiency of Score Matching: The View from Isoperimetry

Deep generative models parametrized up to a normalizing constant (e.g. e...

Fit Like You Sample: Sample-Efficient Generalized Score Matching from Fast Mixing Markov Chains

Score matching is an approach to learning probability distributions para...

On Computationally Efficient Learning of Exponential Family Distributions

We consider the classical problem of learning, with arbitrary accuracy, ...

Interpretation and Generalization of Score Matching

Score matching is a recently developed parameter learning method that is...

Denoising Score Matching with Random Fourier Features

The density estimation is one of the core problems in statistics. Despit...

Adaptive exponential power distribution with moving estimator for nonstationary time series

While standard estimation assumes that all datapoints are from probabili...

Incremental maximum likelihood estimation for efficient adaptive filtering

Adaptive filtering is a well-known problem with a wide range of applicat...
