Generative Hierarchical Models for Parts, Objects, and Scenes

10/21/2019
by   Fei Deng, et al.
9

Compositional structures between parts and objects are inherent in natural scenes. Modeling such compositional hierarchies via unsupervised learning can bring various benefits such as interpretability and transferability, which are important in many downstream tasks. In this paper, we propose the first deep latent variable model, called RICH, for learning Representation of Interpretable Compositional Hierarchies. At the core of RICH is a latent scene graph representation that organizes the entities of a scene into a tree structure according to their compositional relationships. During inference, taking top-down approach, RICH is able to use higher-level representation to guide lower-level decomposition. This avoids the difficult problem of routing between parts and objects that is faced by bottom-up approaches. In experiments on images containing multiple objects with different part compositions, we demonstrate that RICH is able to learn the latent compositional hierarchy and generate imaginary scenes.

READ FULL TEXT

page 6

page 7

page 8

page 9

research
12/07/2021

Unsupervised Learning of Compositional Scene Representations from Multiple Unspecified Viewpoints

Visual scenes are extremely rich in diversity, not only because there ar...
research
01/22/2017

Greedy Compositional Clustering for Unsupervised Learning of Hierarchical Compositional Models

This paper proposes to integrate a feature pursuit learning process into...
research
01/16/2013

Complexity of Representation and Inference in Compositional Models with Part Sharing

This paper describes serial and parallel compositional models of multipl...
research
08/14/2018

Imagining the Unseen: Learning a Distribution over Incomplete Images with Dense Latent Trees

Images are composed as a hierarchy of object parts. We use this insight ...
research
11/17/2021

Three approaches to supervised learning for compositional data with pairwise logratios

The common approach to compositional data analysis is to transform the d...
research
06/22/2020

Learning Physical Graph Representations from Visual Scenes

Convolutional Neural Networks (CNNs) have proved exceptional at learning...
research
11/26/2018

Unsupervised learning with sparse space-and-time autoencoders

We use spatially-sparse two, three and four dimensional convolutional au...

Please sign up or login with your details

Forgot password? Click here to reset