CoDo: Contrastive Learning with Downstream Background Invariance for Detection

05/10/2022
by   Bing Zhao, et al.

Prior self-supervised learning research has mainly used image-level instance discrimination as the pretext task. This achieves classification performance comparable to supervised learning, but transfers poorly to downstream tasks such as object detection. To bridge this gap, we propose a novel object-level self-supervised learning method, Contrastive learning with Downstream background invariance (CoDo). The pretext task is reformulated to model instance location against varied backgrounds, especially backgrounds drawn from downstream datasets, since background invariance is considered vital for object detection. First, a data augmentation strategy pastes instances onto background images and then jitters the bounding boxes to incorporate background information. Second, we align the architecture of our pretraining network with mainstream detection pipelines. Third, hierarchical, multi-view contrastive learning is designed to improve visual representation learning. Experiments on MSCOCO demonstrate that CoDo with a common backbone, ResNet50-FPN, yields strong transfer learning results for object detection.
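The first step of the method, pasting an instance onto a background image and jittering its bounding box so the training crop includes background context, can be sketched as below. This is a minimal illustration, not the authors' implementation; the function name, the `jitter` parameter, and the jitter scheme are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def paste_and_jitter(instance, background, jitter=0.1):
    """Paste an instance crop onto a background image at a random location,
    then jitter/expand its bounding box by up to `jitter` of the instance
    size so the box also covers background pixels. (Hypothetical sketch.)"""
    ih, iw = instance.shape[:2]
    bh, bw = background.shape[:2]
    # random paste location such that the instance fits entirely
    y0 = int(rng.integers(0, bh - ih + 1))
    x0 = int(rng.integers(0, bw - iw + 1))
    composite = background.copy()
    composite[y0:y0 + ih, x0:x0 + iw] = instance
    # random shift (dy, dx) and expansion (dh, dw) of the box,
    # each bounded by `jitter` times the instance size
    dy, dx = (rng.uniform(-jitter, jitter, 2) * (ih, iw)).astype(int)
    dh, dw = (rng.uniform(0, jitter, 2) * (ih, iw)).astype(int)
    # clip the jittered box to the image bounds
    y1 = max(0, y0 + dy - dh)
    x1 = max(0, x0 + dx - dw)
    y2 = min(bh, y0 + ih + dy + dh)
    x2 = min(bw, x0 + iw + dx + dw)
    return composite, (x1, y1, x2, y2)
```

Crops taken from the jittered box then serve as views for contrastive learning, encouraging representations that are invariant to the surrounding background.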

Related research

06/04/2021 - Aligning Pretraining for Detection via Object-Level Contrastive Learning
Image-level contrastive representation learning has proven to be highly ...

03/10/2021 - Spatially Consistent Representation Learning
Self-supervised learning has been widely used to obtain transferrable re...

02/16/2021 - Instance Localization for Self-supervised Detection Pretraining
Prior research on self-supervised learning has led to considerable progr...

04/14/2020 - Distilling Localization for Self-Supervised Representation Learning
For high-level visual recognition, self-supervised learning defines and ...

03/31/2023 - INoD: Injected Noise Discriminator for Self-Supervised Representation Learning in Agricultural Fields
Perception datasets for agriculture are limited both in quantity and div...

10/29/2022 - Pair DETR: Contrastive Learning Speeds Up DETR Training
The DETR object detection approach applies the transformer encoder and d...

07/07/2022 - An Embedding-Dynamic Approach to Self-supervised Learning
A number of recent self-supervised learning methods have shown impressiv...
