Revisiting Reweighted Wake-Sleep

05/26/2018
by Tuan Anh Le, et al.

Discrete latent-variable models, while applicable in a variety of settings, can often be difficult to learn. Sampling discrete latent variables can result in high-variance gradient estimators for two primary reasons: (1) branching on the samples within the model, and (2) the lack of a pathwise derivative for the samples. While current state-of-the-art methods employ control-variate schemes for the former and continuous-relaxation methods for the latter, their utility is limited by the complexities of implementing and training effective control-variate schemes and the necessity of evaluating (potentially exponentially) many branch paths in the model. Here, we revisit the reweighted wake-sleep (RWS) algorithm (Bornschein and Bengio, 2015) and, through extensive evaluations, show that it circumvents both these issues, outperforming current state-of-the-art methods in learning discrete latent-variable models. Moreover, we observe that, unlike the importance weighted autoencoder, RWS learns better models and inference networks with increasing numbers of particles, and that its benefits extend to continuous latent-variable models as well. Our results suggest that RWS is a competitive, often preferable, alternative for learning deep generative models.
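To make concrete why RWS avoids pathwise derivatives through discrete samples, here is a minimal PyTorch sketch of one RWS gradient step (the wake-theta and wake-phi updates). The `model` and `guide` objects, and their `log_joint`, `log_prob`, and `sample` methods, are hypothetical placeholders for illustration, not an interface from the paper.

```python
import torch

def rws_step(model, guide, x, num_particles, model_opt, guide_opt):
    """One reweighted wake-sleep step (sketch).

    Assumed (hypothetical) interface:
      guide.sample(x, K)    -> K latent samples z from q_phi(z | x)
      guide.log_prob(z, x)  -> log q_phi(z | x), shape [K]
      model.log_joint(z, x) -> log p_theta(x, z), shape [K]
    """
    # Draw K particles from the inference network. No reparameterization
    # is needed, so z may be discrete.
    z = guide.sample(x, num_particles)

    # Normalized importance weights w_k proportional to
    # p_theta(x, z_k) / q_phi(z_k | x), computed without tracking
    # gradients: the weights and samples are treated as constants.
    with torch.no_grad():
        log_w = model.log_joint(z, x) - guide.log_prob(z, x)
        w = torch.softmax(log_w, dim=0)

    # Wake-theta update: ascend sum_k w_k * log p_theta(x, z_k).
    model_opt.zero_grad()
    model_loss = -(w * model.log_joint(z, x)).sum()
    model_loss.backward()
    model_opt.step()

    # Wake-phi update: ascend sum_k w_k * log q_phi(z_k | x), a
    # self-normalized importance-sampling estimate of the sleep
    # objective evaluated on real data. (Bornschein and Bengio, 2015,
    # also describe a sleep-phase phi update using samples from the model.)
    guide_opt.zero_grad()
    guide_loss = -(w * guide.log_prob(z, x)).sum()
    guide_loss.backward()
    guide_opt.step()
```

Because the weights are computed under `torch.no_grad()`, gradients never flow through the discrete samples, which is precisely the property the abstract highlights as circumventing the need for control variates and continuous relaxations.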


