Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity

07/03/2020
by Gonçalo M. Correia et al.

Training neural network models with discrete (categorical or structured) latent variables can be computationally challenging, due to the need to marginalize over large or combinatorial sets. To circumvent this issue, one typically resorts to sampling-based approximations of the true marginal, which require either noisy gradient estimators (e.g., the score function estimator) or continuous relaxations with lower-variance reparameterized gradients (e.g., Gumbel-Softmax). In this paper, we propose a new training strategy that replaces these estimators with exact yet efficient marginalization. To achieve this, we parameterize discrete distributions over latent assignments using differentiable sparse mappings: sparsemax and its structured counterparts. In effect, the support of these distributions is greatly reduced, which enables efficient marginalization. We report successful results on three tasks covering a range of latent variable modeling applications: a semi-supervised deep generative model, a latent communication game, and a generative model with a bit-vector latent representation. In all cases, we obtain good performance while retaining the practicality of sampling-based approximations.
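The key mechanism is that sparsemax, unlike softmax, returns a distribution whose support is typically small, so the expected loss can be computed exactly by summing over only the nonzero-probability assignments. Below is a minimal NumPy sketch of this idea, not the authors' implementation; the `losses` array is a hypothetical stand-in for a per-assignment loss such as a decoder's negative log-likelihood.

```python
import numpy as np

def sparsemax(scores):
    """Euclidean projection of a score vector onto the probability simplex
    (Martins & Astudillo, 2016). Unlike softmax, the output is typically
    sparse: many entries are exactly zero."""
    z = np.sort(scores)[::-1]                 # scores in descending order
    cssv = np.cumsum(z)                       # cumulative sums of sorted scores
    k = np.arange(1, len(z) + 1)
    in_support = k * z > cssv - 1             # condition 1 + k*z_k > sum_{j<=k} z_j
    k_z = k[in_support][-1]                   # support size
    tau = (cssv[in_support][-1] - 1) / k_z    # threshold
    return np.maximum(scores - tau, 0.0)

# Toy example with 4 latent assignments.
scores = np.array([2.0, 1.5, -1.0, 0.1])     # e.g., logits from an encoder
p = sparsemax(scores)                        # -> [0.75, 0.25, 0.0, 0.0]

# Hypothetical per-assignment losses (e.g., decoder NLL for each z).
losses = np.array([0.3, 1.2, 0.8, 2.0])

# Exact marginalization: only assignments in the support contribute,
# so no sampling and no gradient estimator is needed.
support = np.nonzero(p)[0]
expected_loss = sum(p[i] * losses[i] for i in support)   # 0.75*0.3 + 0.25*1.2
```

Because sparsemax is differentiable almost everywhere, gradients flow through `p` directly, which is what lets this exact sum replace score-function or Gumbel-Softmax estimators during training.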
