Neural Stein critics with staged L^2-regularization

07/07/2022
by Matthew Repasky, et al.

Learning to differentiate model distributions from observed data is a fundamental problem in statistics and machine learning, and high-dimensional data remains a challenging setting for such problems. Metrics that quantify the disparity between probability distributions, such as the Stein discrepancy, play an important role in statistical testing in high dimensions. In this paper, we consider the setting where one wishes to distinguish between data sampled from an unknown probability distribution and a nominal model distribution. While recent studies have revealed that the optimal L^2-regularized Stein critic equals the difference of the score functions of the two probability distributions up to a multiplicative constant, we investigate the role of L^2 regularization when training a neural network Stein discrepancy critic function. Motivated by the Neural Tangent Kernel theory of training neural networks, we develop a novel staging procedure for the regularization weight over training time. This leverages the advantages of highly regularized training at early times while also empirically delaying overfitting. Theoretically, we relate the training dynamics under a large regularization weight to the kernel regression of the "lazy training" regime at early training times. The benefit of the staged L^2 regularization is demonstrated on simulated high-dimensional distribution-drift data and in an application to evaluating generative models of image data.
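To make the staging idea concrete: the critic f is typically trained to maximize a regularized Stein discrepancy objective of the form E_p[ s_q(x)^T f(x) + div f(x) ] - (lambda/2) E_p[ ||f(x)||^2 ], where s_q is the score function of the nominal model q, and the optimal critic under this L^2 penalty is proportional to the score difference s_p - s_q. The sketch below is a minimal, hedged illustration of a staged regularization weight in PyTorch, not the paper's implementation; the MLP architecture, the `lambda_schedule` step function, and constants such as `lam_hi`, `lam_lo`, and `frac` are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal sketch of a staged-L2 neural Stein critic (illustrative only).

class Critic(nn.Module):
    """Small MLP critic f: R^d -> R^d."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x):
        return self.net(x)


def stein_term(critic, x, score_q):
    """Monte-Carlo estimate of E_p[ s_q(x)^T f(x) + div f(x) ]."""
    x = x.detach().requires_grad_(True)
    f = critic(x)                                    # shape (n, d)
    div = torch.zeros(x.shape[0], device=x.device)   # divergence of f via autograd
    for i in range(f.shape[1]):
        grad_i = torch.autograd.grad(f[:, i].sum(), x, create_graph=True)[0]
        div = div + grad_i[:, i]
    return ((score_q(x) * f).sum(dim=1) + div).mean()


def lambda_schedule(epoch, n_epochs, lam_hi=10.0, lam_lo=0.1, frac=0.3):
    """Staged weight: large lambda early (closer to the lazy-training /
    kernel regime), then a smaller lambda for the remaining epochs."""
    return lam_hi if epoch < frac * n_epochs else lam_lo


def train_critic(critic, batches, score_q, n_epochs=200, lr=1e-3):
    opt = torch.optim.Adam(critic.parameters(), lr=lr)
    for epoch in range(n_epochs):
        lam = lambda_schedule(epoch, n_epochs)
        for x in batches:                            # iterable of (n, d) samples from p
            obj = stein_term(critic, x, score_q)
            penalty = critic(x).pow(2).sum(dim=1).mean()
            loss = -(obj - 0.5 * lam * penalty)      # maximize the regularized objective
            opt.zero_grad()
            loss.backward()
            opt.step()
    return critic
```

For example, with a standard Gaussian nominal model q = N(0, I), the score is simply `score_q = lambda x: -x`, and the trained critic can then be plugged into a Stein-discrepancy goodness-of-fit statistic on held-out data.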

Related research

05/30/2018 · Regularized Kernel and Neural Sobolev Descent: Dynamic MMD Transport
We introduce Regularized Kernel and Neural Sobolev Descent for transport...

11/24/2017 · Central limit theorems for Sinkhorn divergence between probability distributions on finite spaces and statistical applications
The notion of Sinkhorn divergence has recently gained popularity in mach...

05/02/2013 · Testing Hypotheses by Regularized Maximum Mean Discrepancy
Do two data samples come from different distributions? Recent studies of...

11/14/2017 · Sobolev GAN
We propose a new Integral Probability Metric (IPM) between distributions...

01/26/2019 · Witnessing Adversarial Training in Reproducing Kernel Hilbert Spaces
Modern implicit generative models such as generative adversarial network...

02/28/2020 · Generalized Sliced Distances for Probability Distributions
Probability metrics have become an indispensable part of modern statisti...

05/20/2020 · Inverse Estimation of Elastic Modulus Using Physics-Informed Generative Adversarial Networks
While standard generative adversarial networks (GANs) rely solely on tra...
