Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings

02/16/2020
by Shweta Mahajan, et al.

Learned joint representations of images and text form the backbone of several important cross-domain tasks such as image captioning. Prior work mostly maps both domains into a common latent representation in a purely supervised fashion. This is rather restrictive, however, as the two domains follow distinct generative processes. Therefore, we propose a novel semi-supervised framework, which models shared information between domains and domain-specific information separately. The information shared between the domains is aligned with an invertible neural network. Our model integrates normalizing flow-based priors for the domain-specific information, which allows us to learn diverse many-to-many mappings between the two domains. We demonstrate the effectiveness of our model on diverse tasks, including image captioning and text-to-image synthesis.
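To make the flow-based prior from the abstract concrete, below is a minimal PyTorch sketch of a RealNVP-style affine coupling flow used as a learned prior over a domain-specific latent code. This is not the authors' implementation: the class names (AffineCoupling, FlowPrior) and all hyperparameters are illustrative assumptions, and the full model in the paper additionally aligns a shared latent code across domains with an invertible neural network.

```python
# A minimal sketch (assumed names/hyperparameters, not the authors' code) of a
# normalizing-flow prior over a domain-specific latent code, as described in
# the abstract. Each coupling block is invertible with a cheap Jacobian.

import torch
import torch.nn as nn


class AffineCoupling(nn.Module):
    """One RealNVP-style affine coupling block: an invertible transform."""

    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        # Small MLP predicting a per-dimension scale and shift for the
        # second half of the latent, conditioned on the first half.
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, z):
        z1, z2 = z[:, :self.half], z[:, self.half:]
        log_s, t = self.net(z1).chunk(2, dim=1)
        log_s = torch.tanh(log_s)           # keep scales well-behaved
        y2 = z2 * log_s.exp() + t           # invertible given z1
        log_det = log_s.sum(dim=1)          # log |det Jacobian| of the block
        return torch.cat([z1, y2], dim=1), log_det


class FlowPrior(nn.Module):
    """Stack of coupling blocks defining a learned prior density."""

    def __init__(self, dim, n_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList(AffineCoupling(dim) for _ in range(n_blocks))
        self.base = torch.distributions.Normal(0.0, 1.0)

    def log_prob(self, z):
        # Change of variables: log p(z) = log p_base(f(z)) + sum log|det J|.
        total = z.new_zeros(z.shape[0])
        for block in self.blocks:
            z, log_det = block(z)
            z = z.flip(1)                   # swap halves; a permutation, |det| = 1
            total = total + log_det
        return self.base.log_prob(z).sum(dim=1) + total


# Usage: score a batch of (hypothetical) image-specific latent codes.
prior = FlowPrior(dim=8)
z_img = torch.randn(16, 8)
print(prior.log_prob(z_img).shape)          # torch.Size([16])
```

Because each coupling block is invertible with a tractable Jacobian determinant, such a prior gives exact likelihoods and exact sampling, which is what allows the domain-specific latents to stay expressive enough for diverse many-to-many mappings.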

Related research

Diverse Image Captioning with Context-Object Split Latent Spaces (11/02/2020)
Diverse image captioning models aim to learn one-to-many mappings that a...

Cross-domain Network Representations (08/01/2019)
The purpose of network representation is to learn a set of latent featur...

Learning Joint Articulatory-Acoustic Representations with Normalizing Flows (05/16/2020)
The articulatory geometric configurations of the vocal tract and the aco...

FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning (10/26/2022)
Multimodal tasks in the fashion domain have significant potential for e-...

Cooperative Colorization: Exploring Latent Cross-Domain Priors for NIR Image Spectrum Translation (08/07/2023)
Near-infrared (NIR) image spectrum translation is a challenging problem ...

Cross-Domain Multitask Learning with Latent Probit Models (06/27/2012)
Learning multiple tasks across heterogeneous domains is a challenging pr...

Improving Generalization of Image Captioning with Unsupervised Prompt Learning (08/05/2023)
Pretrained visual-language models have demonstrated impressive zero-shot...
