Feature-Suppressed Contrast for Self-Supervised Food Pre-training

by Xinda Liu, et al.

Most previous approaches to analyzing food images rely on extensively annotated datasets, incurring substantial human labeling costs due to the varied and intricate nature of such images. Inspired by the effectiveness of contrastive self-supervised methods in exploiting unlabelled data, we explore leveraging these techniques on unlabelled food images. In contrastive self-supervised learning, two views are randomly generated from an image by data augmentation. For food images, however, the two views tend to contain similar informative content, yielding high mutual information that impedes the efficacy of contrastive self-supervised learning. To address this problem, we propose Feature-Suppressed Contrast (FeaSC) to reduce the mutual information between views. Since the content shared by the two views is salient, i.e., highly responsive, in the feature map, the proposed FeaSC uses a response-aware scheme to localize salient features in an unsupervised manner. By suppressing some salient features in one view while leaving the other contrast view unchanged, the mutual information between the two views is reduced, enhancing the effectiveness of contrastive learning for self-supervised food pre-training. As a plug-and-play module, the proposed method consistently improves BYOL and SimSiam by 1.70% ∼ 6.69% classification accuracy on four publicly available food recognition datasets. Superior results on downstream segmentation tasks further demonstrate the effectiveness of the proposed method.
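The abstract does not specify how salient features are localized or suppressed; the following is a minimal illustrative sketch of one plausible response-aware suppression scheme, assuming saliency is measured by per-location activation magnitude and suppression means zeroing the top-k most responsive spatial positions in one view's feature map. The function name, the aggregation over channels, and the suppression ratio are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def suppress_salient_features(feat, ratio=0.25):
    """Illustrative response-aware feature suppression (assumed scheme).

    feat: (C, H, W) feature map from one augmented view.
    Zeroes the fraction `ratio` of spatial positions with the highest
    aggregate activation magnitude, so the suppressed view shares less
    salient content (lower mutual information) with the unmodified view.
    """
    c, h, w = feat.shape
    # Response map: aggregate absolute activation over channels per position.
    response = np.abs(feat).sum(axis=0).reshape(-1)   # shape (H*W,)
    k = max(1, int(ratio * h * w))
    salient = np.argsort(response)[-k:]               # top-k responsive positions
    mask = np.ones(h * w, dtype=feat.dtype)
    mask[salient] = 0.0                               # suppress salient positions
    return feat * mask.reshape(1, h, w)

# Example: suppress 25% of the most salient positions in one view's features;
# the other view would be passed to the encoder unchanged.
rng = np.random.default_rng(0)
view_feat = rng.standard_normal((8, 4, 4)).astype(np.float32)
suppressed = suppress_salient_features(view_feat, ratio=0.25)
```

Because only one branch is modified, such a module can drop into BYOL- or SimSiam-style pipelines without changing the loss or the architecture, which matches the plug-and-play claim in the abstract.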




