Improving the Generalization of Supervised Models

We consider the problem of training a deep neural network on a given classification task, e.g., ImageNet-1K (IN1K), so that it excels at that task as well as at other (future) transfer tasks. These two seemingly contradictory properties impose a trade-off between improving the model's generalization and maintaining its performance on the original task. Models trained with self-supervised learning (SSL) tend to generalize better than their supervised counterparts for transfer learning; yet, they still lag behind supervised models on IN1K. In this paper, we propose a supervised learning setup that leverages the best of both worlds. We enrich the common supervised training framework using two key components of recent SSL models: multi-scale crops for data augmentation and the use of an expendable projector head. We also replace the last layer of class weights with class prototypes computed on the fly using a memory bank. We show that these three improvements lead to a more favorable trade-off between the IN1K training task and 13 transfer tasks. Over all the explored configurations, we single out two models: t-ReX, which achieves a new state of the art for transfer learning and outperforms top methods such as DINO and PAWS on IN1K, and t-ReX*, which matches the highly optimized RSB-A1 model on IN1K while performing better on transfer tasks. Project page and pretrained models:
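As an illustration of the third component, the sketch below shows one way class prototypes could be computed on the fly from a memory bank and used in place of a learned layer of class weights. It is a minimal sketch under stated assumptions, not the paper's implementation: the `PrototypeMemory` class, the FIFO eviction policy, the L2 normalization, the cosine-similarity softmax, and the `temperature` value are all hypothetical choices that the abstract does not specify. Multi-scale crops and the projector head are not shown.

```python
import math
from collections import deque


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


class PrototypeMemory:
    """Hypothetical FIFO memory bank of (embedding, label) pairs."""

    def __init__(self, capacity):
        # Assumption: the oldest entries are evicted once capacity is reached.
        self.bank = deque(maxlen=capacity)

    def update(self, embeddings, labels):
        for z, y in zip(embeddings, labels):
            self.bank.append((l2_normalize(z), y))

    def prototypes(self):
        """Return one prototype per class: the normalized mean of its stored embeddings."""
        sums, counts = {}, {}
        for z, y in self.bank:
            acc = sums.setdefault(y, [0.0] * len(z))
            for i, x in enumerate(z):
                acc[i] += x
            counts[y] = counts.get(y, 0) + 1
        return {y: l2_normalize([s / counts[y] for s in acc]) for y, acc in sums.items()}


def class_probabilities(z, prototypes, temperature=0.1):
    """Softmax over cosine similarities between z and each class prototype."""
    z = l2_normalize(z)
    classes = sorted(prototypes)
    logits = [sum(a * b for a, b in zip(z, prototypes[c])) / temperature for c in classes]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return {c: e / total for c, e in zip(classes, exps)}


# Toy usage with 2-D embeddings and two classes.
memory = PrototypeMemory(capacity=4)
memory.update([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]], [0, 0, 1, 1])
probs = class_probabilities([1.0, 0.2], memory.prototypes())
```

In this toy setup the query embedding lies close to the class-0 samples, so class 0 receives the higher probability. In an actual training loop the bank would be refreshed with embeddings from each batch, so the prototypes track the encoder as it changes.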




Boosting Self-Supervised Learning via Knowledge Transfer

In self-supervised learning, one trains a model to solve a so-called pre...

Guillotine Regularization: Improving Deep Networks Generalization by Removing their Head

One unexpected technique that emerged in recent years consists in traini...

Removing Rain Streaks via Task Transfer Learning

Due to the difficulty in collecting paired real-world training data, ima...

Training Deep Networks from Zero to Hero: avoiding pitfalls and going beyond

Training deep neural networks may be challenging in real world data. Usi...

The Geometry of Self-supervised Learning Models and its Impact on Transfer Learning

Self-supervised learning (SSL) has emerged as a desirable paradigm in co...

Self supervised learning improves dMMR/MSI detection from histology slides across multiple cancers

Microsatellite instability (MSI) is a tumor phenotype whose diagnosis la...

Technical Report: Combining knowledge from Transfer Learning during training and Wide Resnets

In this report, we combine the idea of Wide ResNets and transfer learnin...
