Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing

03/25/2019
by Hao Fu, et al.

Variational autoencoders (VAEs) with an auto-regressive decoder have been applied to many natural language processing (NLP) tasks. The VAE objective consists of two terms, (i) reconstruction and (ii) KL regularization, balanced by a weighting hyper-parameter β. One notorious training difficulty is that the KL term tends to vanish. In this paper, we study scheduling schemes for β and show that KL vanishing is caused by the lack of good latent codes for training the decoder at the beginning of optimization. To remedy this, we propose a cyclical annealing schedule, which repeats the process of increasing β multiple times. This procedure allows progressively more meaningful latent codes to be learned, because each cycle uses the informative representations of the previous cycles as a warm restart. The effectiveness of cyclical annealing is validated on a broad range of NLP tasks, including language modeling, dialog response generation, and unsupervised language pre-training.
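
The schedule described above can be illustrated with a minimal sketch. The abstract says only that β is increased repeatedly, so the details here are assumptions made for illustration: a linear ramp from 0 to a maximum value, followed by a plateau for the rest of each cycle. The names `cyclical_beta`, `n_cycles`, `ramp_ratio`, and `beta_max` are hypothetical and do not come from the source.

```python
def cyclical_beta(step, total_steps, n_cycles=4, ramp_ratio=0.5, beta_max=1.0):
    """Return the KL weight beta for a given training step.

    Assumed shape (not specified in the abstract): within each cycle, beta
    rises linearly from 0 to beta_max over the first `ramp_ratio` fraction
    of the cycle, then stays at beta_max until the cycle ends.
    """
    if not 0.0 < ramp_ratio <= 1.0:
        raise ValueError("ramp_ratio must be in (0, 1]")
    cycle_len = total_steps / n_cycles
    # Position inside the current cycle, in [0, 1).
    pos = (step % cycle_len) / cycle_len
    return beta_max * min(1.0, pos / ramp_ratio)


def vae_loss(reconstruction_loss, kl_divergence, beta):
    """Weighted VAE objective: reconstruction term plus beta times the KL term."""
    return reconstruction_loss + beta * kl_divergence


if __name__ == "__main__":
    total = 1000
    for s in (0, 50, 125, 200, 250, 300):
        print(s, round(cyclical_beta(s, total), 3))
```

With these assumed defaults, β resets to 0 at the start of every cycle (steps 0, 250, 500, ...), so the decoder is periodically trained with a weak KL penalty while the encoder starts from the latent codes learned in the previous cycle.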


