Segmenting Moving Objects via an Object-Centric Layered Representation

07/05/2022
by Junyu Xie, et al.

The objective of this paper is a model that can discover, track and segment multiple moving objects in a video. We make four contributions. First, we introduce an object-centric segmentation model with a depth-ordered layer representation. This is implemented using a variant of the transformer architecture that ingests optical flow, where each query vector specifies an object and its layer for the entire video. The model can effectively discover multiple moving objects and handle mutual occlusions. Second, we introduce a scalable pipeline for generating synthetic training data with multiple objects, significantly reducing the requirements for labour-intensive annotations and supporting Sim2Real generalisation. Third, we show that the model is able to learn object permanence and temporal shape consistency, and is able to predict amodal segmentation masks. Fourth, we evaluate the model on standard video segmentation benchmarks (DAVIS, MoCA, SegTrack and FBMS-59) and achieve state-of-the-art unsupervised segmentation performance, even outperforming several supervised approaches. With test-time adaptation, we observe further performance boosts.
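To make the idea of a depth-ordered layer representation concrete, the following is a minimal sketch of how amodal masks (full object extent, including occluded parts) could be turned into modal masks (visible parts only) once a front-to-back ordering is known. It is an illustration under stated assumptions, not the authors' implementation: the function name `amodal_to_modal`, the list-of-lists mask format, and the convention that index 0 is the front-most layer are all hypothetical choices made for this example.

```python
from typing import List

# Assumption: a mask is a 2D grid of 0/1 values, 1 marking an object pixel.
Mask = List[List[int]]


def amodal_to_modal(amodal_layers: List[Mask]) -> List[Mask]:
    """Derive visible (modal) masks from amodal masks ordered front to back.

    amodal_layers[0] is assumed to be the front-most layer. A pixel of a
    layer is visible only if no layer in front of it already covers it.
    """
    if not amodal_layers:
        return []
    height, width = len(amodal_layers[0]), len(amodal_layers[0][0])
    # Tracks pixels already claimed by a layer closer to the camera.
    occupied = [[0] * width for _ in range(height)]
    modal_layers: List[Mask] = []
    for mask in amodal_layers:
        visible = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                if mask[y][x] and not occupied[y][x]:
                    visible[y][x] = 1
                    occupied[y][x] = 1
        modal_layers.append(visible)
    return modal_layers


# Two overlapping objects: `front` occludes part of `back`.
front = [[0, 1, 1, 0],
         [0, 1, 1, 0]]
back = [[1, 1, 0, 0],
        [1, 1, 0, 0]]
modal = amodal_to_modal([front, back])
```

Here the front layer stays unchanged, while the back layer loses the column where the two objects overlap. In a learned model the masks would be soft predictions rather than hard binary grids, so this hard compositing should be read only as a picture of what the layer ordering means.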


research
04/15/2021

Self-supervised Video Object Segmentation by Motion Grouping

Animals have evolved highly functional visual systems to understand moti...
research
05/26/2022

Unsupervised Multi-object Segmentation Using Attention and Soft-argmax

We introduce a new architecture for unsupervised object-centric represen...
research
03/14/2023

InstMove: Instance Motion for Object-centric Video Segmentation

Despite significant efforts, cutting-edge video segmentation methods sti...
research
08/01/2022

BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation

Video Object Segmentation (VOS) is fundamental to video understanding. T...
research
10/01/2022

Motion-inductive Self-supervised Object Discovery in Videos

In this paper, we consider the task of unsupervised object discovery in ...
research
08/16/2020

Time-Supervised Primary Object Segmentation

We describe an unsupervised method to detect and segment portions of liv...
research
08/15/2023

Helping Hands: An Object-Aware Ego-Centric Video Recognition Model

We introduce an object-aware decoder for improving the performance of sp...
