Eval all, trust a few, do wrong to none: Comparing sentence generation models
In this paper, we study recent neural generative models for text that are related to variational autoencoders. These models employ various techniques to match the posterior and prior distributions, which is important for ensuring high sample quality and low reconstruction error. In our study, we follow a rigorous evaluation protocol that combines a large set of previously used and novel automatic metrics with human evaluation of both generated samples and reconstructions. We hope that this protocol will become the new evaluation standard when comparing neural generative models for text.