Improving Pixel-Level Contrastive Learning by Leveraging Exogenous Depth Information

11/18/2022
by   Ahmed Ben Saad, et al.

Self-supervised representation learning based on Contrastive Learning (CL) has attracted much attention in recent years, owing to the excellent results it obtains on a variety of downstream tasks (classification in particular) without requiring large amounts of labeled samples. However, most reference CL algorithms (such as SimCLR and MoCo, but also BYOL and Barlow Twins) are not suited to pixel-level downstream tasks. One existing solution, PixPro, proposes a pixel-level approach that filters pairs of positive/negative image crops of the same image using the distance between the crops in the whole image. We argue that this idea can be further enhanced by incorporating semantic information provided by exogenous data as an additional selection filter, used at training time to improve the selection of pixel-level positive/negative samples. In this paper we focus on depth information, which can be obtained with a depth estimation network or measured from available data (stereovision, parallax motion, LiDAR, etc.). Scene depth provides meaningful cues for distinguishing pixels that belong to different objects. We show that using this exogenous information in the contrastive loss leads to improved results and that the learned representations better follow the shapes of objects. In addition, we introduce a multi-scale loss that alleviates the problem of finding training parameters suited to different object sizes. We demonstrate the effectiveness of our ideas on Breakout Segmentation on Borehole Images, where we achieve an improvement of 1.9% over PixPro and nearly 5% over the supervised baseline. We further validate our technique on indoor scene segmentation with ScanNet and outdoor scenes with CityScapes (1.6% and 1.1% improvement over PixPro, respectively).
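The selection rule the abstract describes can be sketched as follows: a pair of pixels (one from each crop) counts as positive when they are spatially close in the original image (the PixPro criterion) and additionally have similar scene depth (the exogenous cue proposed here). This is a minimal illustrative sketch, not the paper's implementation; the function name, threshold values, and the absolute-difference form of the depth test are all assumptions.

```python
import numpy as np

def positive_pair_mask(coords_a, coords_b, depth_a, depth_b,
                       dist_thresh=0.7, depth_thresh=0.1):
    """Select positive pixel pairs for a pixel-level contrastive loss.

    coords_a: (N, 2) pixel coordinates of crop A, in full-image space
    coords_b: (M, 2) pixel coordinates of crop B, in full-image space
    depth_a:  (N,)   depth values at the pixels of crop A
    depth_b:  (M,)   depth values at the pixels of crop B
    Returns an (N, M) boolean mask: True where the pair is positive.
    """
    # Pairwise Euclidean distance between pixel positions in the
    # original image (the PixPro-style spatial criterion).
    diff = coords_a[:, None, :] - coords_b[None, :, :]
    spatial = np.linalg.norm(diff, axis=-1)

    # Pairwise absolute depth difference (the exogenous depth filter).
    depth_gap = np.abs(depth_a[:, None] - depth_b[None, :])

    # A pair is positive only if it passes both filters.
    return (spatial < dist_thresh) & (depth_gap < depth_thresh)
```

In a training loop, this mask would gate which feature-vector pairs contribute to the positive term of the contrastive loss; pairs that are spatially close but lie at very different depths (e.g. an object boundary against a distant background) are excluded, which is the intuition behind using depth to separate objects.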

