Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

02/08/2021
by Will Grathwohl, et al.

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings, including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate the use of our improved sampler for training deep energy-based models on high-dimensional discrete data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers that propose local updates.
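To make the idea of a gradient-informed Metropolis-Hastings proposal concrete, here is a minimal sketch for a binary model. It is an illustration under stated assumptions, not the authors' reference implementation: the quadratic log-probability `x^T W x + b^T x`, the single-bit-flip move, the softmax over half the first-order estimate of the change in log-probability, and all function names (`log_prob`, `grad_log_prob`, `flip_proposal`, `mh_step`) are choices made here for illustration and are not specified in the abstract.

```python
import numpy as np

def log_prob(x, W, b):
    """Unnormalized log-probability f(x) = x^T W x + b^T x for x in {0,1}^D (assumed model)."""
    return x @ W @ x + b @ x

def grad_log_prob(x, W, b):
    """Gradient of f with respect to x, treating the binary vector as continuous."""
    return (W + W.T) @ x + b

def flip_proposal(x, W, b):
    """Distribution over which bit to flip, built from a first-order estimate."""
    # First-order estimate of the change in f if bit i is flipped:
    # flipping moves x_i by (1 - 2 x_i), so the change is about -(2 x_i - 1) * df/dx_i.
    delta = -(2.0 * x - 1.0) * grad_log_prob(x, W, b)
    logits = delta / 2.0            # assumed tempering of the estimate
    logits = logits - logits.max()  # numerical stability for the softmax
    p = np.exp(logits)
    return p / p.sum()

def mh_step(x, W, b, rng):
    """One Metropolis-Hastings step with the gradient-informed flip proposal."""
    q_fwd = flip_proposal(x, W, b)
    i = rng.choice(len(x), p=q_fwd)
    x_new = x.copy()
    x_new[i] = 1.0 - x_new[i]
    # The reverse move flips the same bit back, so we need q(i | x_new).
    q_rev = flip_proposal(x_new, W, b)
    log_accept = (log_prob(x_new, W, b) - log_prob(x, W, b)
                  + np.log(q_rev[i]) - np.log(q_fwd[i]))
    if np.log(rng.random()) < log_accept:
        return x_new, True
    return x, False

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = 10
    A = rng.normal(size=(D, D))
    W = (A + A.T) / 2.0
    np.fill_diagonal(W, 0.0)  # zero diagonal, as in an Ising-style pairwise model
    b = rng.normal(size=D)
    x = rng.integers(0, 2, size=D).astype(float)
    accepted = 0
    for _ in range(1000):
        x, ok = mh_step(x, W, b, rng)
        accepted += ok
    print("acceptance rate:", accepted / 1000)
```

One gradient evaluation scores all `D` candidate flips at once, which is what lets the proposal favor promising coordinates without evaluating the model `D` times. For this assumed quadratic model with a symmetric, zero-diagonal `W`, the first-order estimate happens to equal the exact change in log-probability; for a general model it is only an approximation, and the accept/reject step corrects for the error.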

research
06/20/2022

A Langevin-like Sampler for Discrete Distributions

We propose discrete Langevin proposal (DLP), a simple and scalable gradi...
research
11/10/2020

Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration

Discrete structures play an important role in applications like program ...
research
08/20/2017

Boltzmann machines and energy-based models

We review Boltzmann machines and energy-based models. A Boltzmann machin...
research
01/23/2023

Explaining the effects of non-convergent sampling in the training of Energy-Based Models

In this paper, we quantify the impact of using non-convergent Markov cha...
research
06/02/2011

Restricted Collapsed Draw: Accurate Sampling for Hierarchical Chinese Restaurant Process Hidden Markov Models

We propose a restricted collapsed draw (RCD) sampler, a general Markov c...
research
06/29/2022

Discrete Langevin Sampler via Wasserstein Gradient Flow

Recently, a family of locally balanced (LB) samplers has demonstrated ex...
research
01/14/2019

High-dimensional structure learning of binary pairwise Markov networks: A comparative numerical study

Learning the undirected graph structure of a Markov network from data is...