Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

10/28/2021
by Kirill Struminsky, et al.

Structured latent variables allow incorporating meaningful prior knowledge into deep learning models. However, learning with such variables remains challenging because of their discrete nature. Nowadays, the standard learning approach is to define a latent variable as a perturbed algorithm output and to use a differentiable surrogate for training. In general, the surrogate puts additional constraints on the model and inevitably leads to biased gradients. To alleviate these shortcomings, we extend the Gumbel-Max trick to define distributions over structured domains. We avoid the differentiable surrogates by leveraging the score function estimators for optimization. In particular, we highlight a family of recursive algorithms with a common feature we call stochastic invariant. The feature allows us to construct reliable gradient estimates and control variates without additional constraints on the model. In our experiments, we consider various structured latent variable models and achieve results competitive with relaxation-based counterparts.
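The paper itself works with structured domains and recursive algorithms, but the two ingredients named in the abstract, the Gumbel-Max trick and the score function estimator with a control variate, can be illustrated on a plain categorical variable. The sketch below is not the paper's method: it is a minimal NumPy example, with hypothetical helper names `gumbel_max_sample` and `score_function_grad`, showing how a Gumbel-Max sample is drawn and how a REINFORCE-style gradient with a simple mean baseline is estimated.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_max_sample(logits):
    """Sample a categorical index via the Gumbel-Max trick:
    argmax_i (logits_i + G_i), with G_i ~ Gumbel(0, 1)."""
    gumbels = -np.log(-np.log(rng.uniform(size=logits.shape)))
    return int(np.argmax(logits + gumbels))

def score_function_grad(logits, f, n_samples=1000):
    """Monte Carlo estimate of d E[f(z)] / d logits using the
    score function (REINFORCE) estimator. A shared mean baseline
    serves as a crude control variate (a leave-one-out baseline
    would keep the estimate strictly unbiased)."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    samples = [gumbel_max_sample(logits) for _ in range(n_samples)]
    fs = np.array([f(z) for z in samples], dtype=float)
    baseline = fs.mean()
    grad = np.zeros_like(logits)
    for z, fz in zip(samples, fs):
        # Gradient of log p(z) w.r.t. the logits of a softmax categorical:
        # onehot(z) - probs.
        score = -probs.copy()
        score[z] += 1.0
        grad += (fz - baseline) * score
    return grad / n_samples

# Example: the gradient points toward increasing the logit of category 2.
logits = np.zeros(4)
f = lambda z: 1.0 if z == 2 else 0.0
print(score_function_grad(logits, f))
```

The paper's contribution is to extend this recipe to structured variables produced by recursive algorithms with the stochastic-invariant property; the example above only conveys why no differentiable surrogate is needed when the score function estimator is used.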


Related research

07/03/2020 · Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity
Training neural network models with discrete (categorical or structured)...

11/09/2021 · Double Control Variates for Gradient Estimation in Discrete Latent Variable Models
Stochastic gradient-based optimisation for discrete latent variable mode...

11/22/2019 · Low-variance Black-box Gradient Estimates for the Plackett-Luce Distribution
Learning models with discrete latent variables using stochastic gradient...

03/21/2017 · REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
Learning in models with discrete latent variables is challenging due to ...

06/15/2020 · Gradient Estimation with Stochastic Softmax Tricks
The Gumbel-Max trick is the basis of many relaxed gradient estimators. T...

05/26/2018 · Revisiting Reweighted Wake-Sleep
Discrete latent-variable models, while applicable in a variety of settin...

10/05/2020 · Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning
Latent structure models are a powerful tool for modeling language data: ...
