Self-Supervised Video Object Segmentation via Cutout Prediction and Tagging

04/22/2022
by Jyoti Kini, et al.

We propose a novel self-supervised Video Object Segmentation (VOS) approach that aims to improve object-background discriminability for accurate object segmentation. Unlike previous self-supervised VOS methods, which rely only on object appearance, our approach is based on a discriminative learning loss that takes both object and background information into account. The loss comprises a cutout-based reconstruction term and a tag prediction term, where a cutout region is a part of a frame whose pixels are replaced with constant values. The cutout-based reconstruction term uses a simple cutout scheme to learn the pixel-wise correspondence between the current and previous frames in order to reconstruct the original current frame after a cutout region has been added to it. The introduced cutout patch guides the model to attend to the less significant features of the object of interest as much as to the significant ones, thereby implicitly equipping the model to handle occlusion. The tag prediction term encourages object-background separability by grouping the similar tags of all pixels in the cutout region while separating them from the tags of the remaining pixels of the reconstructed frame. Additionally, we introduce a zoom-in scheme that addresses small-object segmentation by capturing fine structural information at multiple scales. Our proposed approach, termed CT-VOS, achieves state-of-the-art results on two challenging benchmarks: DAVIS-2017 and YouTube-VOS. A detailed ablation shows the importance of the proposed loss formulation for capturing object-background discriminability and the impact of the zoom-in scheme on accurately segmenting small objects.
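
The two loss ingredients described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names `apply_cutout` and `tag_separation_loss`, the rectangular shape of the cutout, the zero fill value, the scalar per-pixel tags, and the pull/push form of the loss are all assumptions made for illustration only.

```python
import numpy as np


def apply_cutout(frame, top, left, height, width, fill_value=0.0):
    """Return a copy of `frame` with a rectangle set to a constant, plus its mask.

    Assumption: the cutout is an axis-aligned rectangle filled with one constant.
    `frame` has shape (H, W) or (H, W, C).
    """
    out = frame.copy()
    mask = np.zeros(frame.shape[:2], dtype=bool)
    mask[top:top + height, left:left + width] = True
    out[mask] = fill_value  # replace the cutout pixels with the constant
    return out, mask


def tag_separation_loss(tags, mask):
    """Hypothetical pull/push loss over scalar per-pixel tags of shape (H, W).

    pull: tags inside the cutout should be close to their own mean.
    push: the mean tag inside the cutout should be far from the mean tag
          of the remaining pixels.
    The exact form used in CT-VOS is not stated in the abstract.
    """
    inside = tags[mask]
    outside = tags[~mask]
    mu_in = inside.mean()
    mu_out = outside.mean()
    pull = ((inside - mu_in) ** 2).mean()
    push = np.exp(-0.5 * (mu_in - mu_out) ** 2)
    return pull + push


# Usage: cut a 2x2 patch out of a 4x4 RGB frame and score two tag maps.
frame = np.ones((4, 4, 3))
cut_frame, cut_mask = apply_cutout(frame, top=1, left=1, height=2, width=2)

separated_tags = np.where(cut_mask, 1.0, -1.0)  # cutout and rest clearly apart
uniform_tags = np.zeros((4, 4))                 # no separation at all
loss_separated = tag_separation_loss(separated_tags, cut_mask)
loss_uniform = tag_separation_loss(uniform_tags, cut_mask)
```

Under these assumptions, a tag map that separates the cutout from the rest of the frame scores a lower loss than a uniform one, which is the qualitative behaviour the tag prediction term is described as encouraging.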

research
03/17/2023

Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation

The objective of this paper is self-supervised learning of video object ...
research
07/18/2019

Self-supervised Training of Proposal-based Segmentation via Background Prediction

While supervised object detection methods achieve impressive accuracy, t...
research
11/11/2020

Self-supervised Segmentation via Background Inpainting

While supervised object detection and segmentation methods achieve impre...
research
04/18/2019

Discriminative Online Learning for Fast Video Object Segmentation

We address the highly challenging problem of video object segmentation. ...
research
04/15/2021

Self-supervised Video Object Segmentation by Motion Grouping

Animals have evolved highly functional visual systems to understand moti...
research
09/29/2019

RPM-Net: Robust Pixel-Level Matching Networks for Self-Supervised Video Object Segmentation

In this paper, we introduce a self-supervised approach for video object ...
research
09/25/2021

Contrastive Learning for Mitochondria Segmentation

Mitochondria segmentation in electron microscopy images is essential in ...