Zero-Shot Learning via Latent Space Encoding

12/26/2017
by Yunlong Yu, et al.

Zero-Shot Learning (ZSL) is typically achieved by resorting to a class semantic embedding space to transfer knowledge from seen classes to unseen ones. Capturing the common semantic characteristics between the visual modality and the class semantic modality (e.g., attributes or word vectors) is key to the success of ZSL. In this paper, we present a novel approach called Latent Space Encoding (LSE) for ZSL based on an encoder-decoder framework, which learns a highly effective latent space that reconstructs both the visual space and the semantic embedding space well. For each modality, the encoder-decoder framework jointly maximizes the recoverability of the original space from the latent space and the predictability of the latent space from the original space, thus making the latent space feature-aware. To relate the visual and class semantic modalities, features of the two modalities that refer to the same concept are enforced to share the same latent codings. In this way, the semantic relations between the modalities are generalized via the latent representations. We also show that the proposed encoder-decoder framework extends easily to more than two modalities. Extensive experimental results on four benchmark datasets (AwA, CUB, aPY, and ImageNet) clearly demonstrate the superiority of the proposed approach on several ZSL tasks, including traditional ZSL, generalized ZSL, and zero-shot retrieval (ZSR).
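The objective described in the abstract can be illustrated with a minimal sketch: two linear encoder-decoder pairs (one per modality) tied to a shared set of latent codes, trained by gradient descent on a loss that combines recoverability (the latent codes reconstruct each original space) and predictability (each original space predicts the latent codes). All dimensions, variable names, the linear parameterization, and the learning rate below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n samples with visual features X and class semantic features S
# that are assumed to share k latent dimensions (sizes are illustrative).
n, d_v, d_s, k = 50, 20, 10, 5
Z_true = rng.normal(size=(n, k))
X = Z_true @ rng.normal(size=(k, d_v))   # visual modality
S = Z_true @ rng.normal(size=(k, d_s))   # class semantic modality

# Per-modality encoder E and decoder D, plus shared latent codes Z.
E_x, D_x = rng.normal(size=(d_v, k)) * 0.1, rng.normal(size=(k, d_v)) * 0.1
E_s, D_s = rng.normal(size=(d_s, k)) * 0.1, rng.normal(size=(k, d_s)) * 0.1
Z = rng.normal(size=(n, k)) * 0.1

def loss():
    # Recoverability terms: ||X - Z D_x||^2 and ||S - Z D_s||^2.
    # Predictability terms: ||Z - X E_x||^2 and ||Z - S E_s||^2.
    return (np.sum((X - Z @ D_x) ** 2) + np.sum((Z - X @ E_x) ** 2)
          + np.sum((S - Z @ D_s) ** 2) + np.sum((Z - S @ E_s) ** 2))

lr = 1e-4
history = [loss()]
for _ in range(1000):
    # Residuals of all four terms, evaluated at the current parameters.
    R_x, P_x = X - Z @ D_x, Z - X @ E_x
    R_s, P_s = S - Z @ D_s, Z - S @ E_s
    # Simultaneous gradient step on the shared codes and all four maps.
    Z   -= lr * 2 * (-R_x @ D_x.T + P_x - R_s @ D_s.T + P_s)
    D_x -= lr * 2 * (-(Z.T @ R_x))
    E_x -= lr * 2 * (-(X.T @ P_x))
    D_s -= lr * 2 * (-(Z.T @ R_s))
    E_s -= lr * 2 * (-(S.T @ P_s))
    history.append(loss())
```

Because the two modalities reconstruct from, and predict, the same codes Z, the latent space is forced to carry the characteristics common to both, which is the mechanism the abstract describes for transferring knowledge to unseen classes.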


