Relaxed Wasserstein with Applications to GANs
We propose a novel class of statistical divergences called Relaxed Wasserstein (RW) divergence. RW divergence generalizes the Wasserstein distance and is parametrized by strictly convex, differentiable functions. We establish several key probabilistic properties of RW divergence that are critical to the success of Wasserstein distances. In particular, we show that RW divergence is dominated by both the Total Variation (TV) distance and the Wasserstein-L^2 distance, and we establish continuity, differentiability, and a duality representation for RW divergence. Finally, we provide a non-asymptotic moment estimate and a concentration inequality for RW divergence. Our experiments on image generation show that RWGANs with the Kullback-Leibler (KL) divergence achieve performance competitive with many state-of-the-art approaches. Empirically, RWGANs exhibit better convergence properties than WGANs, with comparable inception scores. In contrast to the existing GAN literature, where cost functions are often chosen in an ad hoc manner, this conceptual framework not only provides great flexibility in designing general cost functions, e.g., for applications to GANs, but also allows different cost functions to be implemented and compared under a unified mathematical framework.
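The abstract does not spell out the construction, but since RW divergence is parametrized by strictly convex, differentiable functions, a natural reading is that the usual squared-Euclidean ground cost of the Wasserstein-L^2 distance is replaced by the Bregman divergence of such a function. The following minimal sketch works under that assumption; the function names `bregman_cost` and `relaxed_wasserstein`, and the sample-matching estimator, are illustrative choices rather than the authors' definitions.

```python
# Sketch (assumption, not taken from the abstract): estimate a Relaxed Wasserstein
# divergence between two equally weighted empirical samples by using the Bregman
# divergence of a strictly convex function phi as the ground transport cost.
import numpy as np
from scipy.optimize import linear_sum_assignment


def bregman_cost(x, y, phi, grad_phi):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)


def relaxed_wasserstein(xs, ys, phi, grad_phi):
    """Discrete RW estimate: exact optimal matching of two same-size samples
    under the Bregman ground cost (Hungarian algorithm)."""
    cost = np.array([[bregman_cost(x, y, phi, grad_phi) for y in ys] for x in xs])
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()


# KL-type generator phi(x) = sum_i x_i log x_i (defined on the positive orthant);
# for probability vectors the resulting Bregman cost is the KL divergence.
phi_kl = lambda x: np.sum(x * np.log(x))
grad_phi_kl = lambda x: np.log(x) + 1.0

rng = np.random.default_rng(0)
xs = rng.dirichlet(np.ones(5), size=64)        # samples from one distribution
ys = rng.dirichlet(np.ones(5) * 2.0, size=64)  # samples from another
print("RW-KL estimate:", relaxed_wasserstein(xs, ys, phi_kl, grad_phi_kl))
```

Taking phi(x) = ||x||^2 / 2 instead makes the Bregman cost the squared Euclidean distance, so the sketch then reduces to an estimate of the standard Wasserstein-L^2 setting the abstract compares against.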