Optical Flow boosts Unsupervised Localization and Segmentation

07/25/2023
by Xinyu Zhang, et al.
Unsupervised localization and segmentation are long-standing robot vision challenges: they capture the critical ability of an autonomous robot to learn to decompose images into individual objects without labeled data. These tasks matter because dense manual image annotations are scarce and because lifelong learning requires adapting to an evolving set of object categories. Most recent methods use visual appearance continuity as an object cue, spatially clustering features obtained from self-supervised vision transformers (ViT). In this work, we leverage motion cues, inspired by the common fate principle: pixels that share similar movements tend to belong to the same object. We propose a new loss term that uses optical flow in unlabeled videos to encourage self-supervised ViT features to move closer together when their corresponding spatial locations share similar movements, and farther apart when they do not. We use the proposed loss to fine-tune vision transformers that were originally trained on static images. Our fine-tuning procedure outperforms state-of-the-art techniques for unsupervised semantic segmentation through linear probing, without using any labeled data. It also improves on the original ViT networks across unsupervised object localization and semantic segmentation benchmarks.
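
The abstract does not give the loss formula, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes one feature vector and one flow vector per image patch, and it penalizes the gap between pairwise feature similarity (cosine) and pairwise flow affinity (a Gaussian kernel). The function names, the Gaussian kernel, the squared-error penalty, and the `sigma` parameter are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-8)  # small epsilon avoids division by zero


def flow_affinity(f1, f2, sigma=1.0):
    """Gaussian affinity in (0, 1]; equals 1 when the two flow vectors match.

    The kernel and sigma are illustrative assumptions, not from the paper.
    """
    d2 = sum((x - y) ** 2 for x, y in zip(f1, f2))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def flow_guided_loss(features, flows, sigma=1.0):
    """Mean squared gap between feature similarity and flow affinity.

    features: one feature vector per patch (hypothetical ViT patch features)
    flows:    one (u, v) optical-flow vector per patch, in the same order
    """
    n = len(features)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            target = flow_affinity(flows[i], flows[j], sigma)
            total += (cosine(features[i], features[j]) - target) ** 2
            count += 1
    return total / count if count else 0.0


# Toy example: patches 0 and 1 move together, patch 2 moves differently.
flows = [(1.0, 0.0), (1.0, 0.0), (-3.0, 2.0)]
aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]     # features agree with motion
misaligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # features contradict motion

loss_aligned = flow_guided_loss(aligned, flows)
loss_misaligned = flow_guided_loss(misaligned, flows)
```

In this toy setup the loss is lower when features of co-moving patches are similar and features of differently moving patches are dissimilar, which is the behavior the abstract describes. A practical version would operate on batched tensors in a deep learning framework so that gradients can flow back into the ViT.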


research
07/24/2023

MC-JEPA: A Joint-Embedding Predictive Architecture for Self-Supervised Learning of Motion and Content Features

Self-supervised learning of visual representations has been focusing on ...
research
05/16/2022

Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization

Unsupervised localization and segmentation are long-standing computer vi...
research
04/27/2022

Self-Supervised Learning of Object Parts for Semantic Segmentation

Progress in self-supervised learning has brought strong general image re...
research
07/15/2018

Cross Pixel Optical Flow Similarity for Self-Supervised Learning

We propose a novel method for learning convolutional neural image repres...
research
04/15/2021

Self-supervised Video Object Segmentation by Motion Grouping

Animals have evolved highly functional visual systems to understand moti...
research
04/17/2023

Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

We study learning object segmentation from unlabeled videos. Humans can ...
research
11/30/2020

Unsupervised Optical Flow Using Cost Function Unrolling

Analyzing motion between two consecutive images is one of the fundamenta...
