Fishr: Invariant Gradient Variances for Out-of-distribution Generalization

09/07/2021
by Alexandre Ramé, et al.

Learning robust models that generalize well under changes in the data distribution is critical for real-world applications. To this end, there has been growing interest in learning simultaneously from multiple training domains while enforcing different types of invariance across those domains. Yet all existing approaches fail to show systematic benefits under fair evaluation protocols. In this paper, we propose a new learning scheme that enforces domain invariance in the space of the gradients of the loss function: specifically, we introduce a regularization term that matches the domain-level variances of gradients across training domains. Critically, our strategy, named Fishr, is closely related to the Fisher Information and the Hessian of the loss. We show that forcing domain-level gradient covariances to be similar during training eventually aligns the domain-level loss landscapes locally around the final weights. Extensive experiments demonstrate the effectiveness of Fishr for out-of-distribution generalization. In particular, Fishr improves the state of the art on the DomainBed benchmark and performs significantly better than Empirical Risk Minimization. The code is released at https://github.com/alexrame/fishr.
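To make the idea of matching domain-level gradient variances concrete, here is a minimal pure-Python sketch. It is not the released implementation. It assumes a toy logistic-regression model, and it assumes the penalty takes the form of the mean squared distance between each domain's per-coordinate gradient variance and the average of those variances across domains. The function names (`per_sample_grads`, `grad_variance`, `variance_matching_penalty`) and the toy data are hypothetical and chosen only for illustration.

```python
import math

def per_sample_grads(w, xs, ys):
    """Gradient of the logistic loss w.r.t. w, computed separately for each sample."""
    grads = []
    for x, y in zip(xs, ys):
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))          # predicted probability
        grads.append([(p - y) * xi for xi in x])
    return grads

def grad_variance(grads):
    """Per-coordinate (unbiased) variance of per-sample gradients within one domain."""
    n, dim = len(grads), len(grads[0])
    means = [sum(g[j] for g in grads) / n for j in range(dim)]
    return [sum((g[j] - means[j]) ** 2 for g in grads) / (n - 1) for j in range(dim)]

def variance_matching_penalty(w, domains):
    """Assumed penalty: mean squared distance of each domain's gradient
    variance to the average gradient variance across domains."""
    variances = [grad_variance(per_sample_grads(w, xs, ys)) for xs, ys in domains]
    dim = len(w)
    avg = [sum(v[j] for v in variances) / len(variances) for j in range(dim)]
    total = sum(sum((v[j] - avg[j]) ** 2 for j in range(dim)) for v in variances)
    return total / len(variances)

# Two toy training domains (features, binary labels); domain_b has rescaled features.
domain_a = ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1, 0, 1])
domain_b = ([[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]], [1, 0, 1])
w = [0.1, -0.2]

penalty_different = variance_matching_penalty(w, [domain_a, domain_b])
penalty_identical = variance_matching_penalty(w, [domain_a, domain_a])
print(penalty_different, penalty_identical)
```

In this sketch the penalty vanishes when the domains have identical gradient statistics and is positive otherwise. During training, a term of this kind would be added to the usual empirical risk with a weighting coefficient, which is itself an assumption here rather than a detail stated in the abstract.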


