What Is Missing in IRM Training and Evaluation? Challenges and Solutions

by Yihua Zhang, et al.

Invariant risk minimization (IRM) has received increasing attention as a way to acquire environment-agnostic data representations and predictions, and as a principled solution for preventing spurious correlations from being learned and for improving models' out-of-distribution generalization. Yet, recent works have found that the optimality of the originally proposed IRM optimization method may be compromised in practice, or could be impossible to achieve in some scenarios. Therefore, a series of advanced IRM algorithms have been developed that show practical improvement over the original IRM. In this work, we revisit these recent IRM advancements, and identify and resolve three practical limitations in IRM training and evaluation. First, we find that the effect of batch size during training has been chronically overlooked in previous studies, leaving room for further improvement. We propose small-batch training and highlight its improvements over a set of large-batch optimization techniques. Second, we find that improper selection of evaluation environments could give a false sense of invariance for IRM. To alleviate this effect, we leverage diversified test-time environments to precisely characterize the invariance of IRM when applied in practice. Third, we revisit the proposal of Ahuja et al. (2020) to convert IRM into an ensemble game, and identify a limitation when a single invariant predictor is desired instead of an ensemble of individual predictors. We propose a new IRM variant to address this limitation, based on a novel viewpoint of ensemble IRM games as consensus-constrained bi-level optimization. Lastly, we conduct extensive experiments (covering 7 existing IRM variants and 7 datasets) to justify the practical significance of revisiting IRM training and evaluation in a principled manner.
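As a rough illustration of the kind of objective the abstract refers to, the widely used IRMv1 surrogate adds, for each training environment, a penalty on the gradient of that environment's risk with respect to a scalar classifier scaling. The sketch below is illustrative only and not the paper's implementation: it assumes a squared loss so the gradient at scaling w = 1 is available in closed form, and the function name and `lam` weight are our own choices.

```python
import numpy as np

def irmv1_objective(env_preds, env_labels, lam=1.0):
    """Toy IRMv1-style objective: sum over environments of the risk
    plus a penalty on the squared gradient of that risk w.r.t. a
    scalar classifier scaling w, evaluated at w = 1.

    Assumes a squared loss, so the gradient is computed in closed form
    rather than by automatic differentiation.
    """
    total = 0.0
    for yhat, y in zip(env_preds, env_labels):
        risk = np.mean((yhat - y) ** 2)
        # d/dw of mean((w * yhat - y)^2), evaluated at w = 1
        grad_w = np.mean(2.0 * (yhat - y) * yhat)
        total += risk + lam * grad_w ** 2
    return total
```

A predictor that is simultaneously optimal in every environment incurs zero penalty, while a predictor relying on environment-specific (spurious) features pays a nonzero gradient penalty in at least one environment.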




Invariant Risk Minimization Games

The standard risk minimization paradigm of machine learning is brittle w...

Meta-Learned Invariant Risk Minimization

Empirical Risk Minimization (ERM) based machine learning algorithms have...

Pareto Invariant Risk Minimization

Despite the success of invariant risk minimization (IRM) in tackling the...

On Invariance Penalties for Risk Minimization

The Invariant Risk Minimization (IRM) principle was first proposed by Ar...

Learning Optimal Features via Partial Invariance

Learning models that are robust to test-time distribution shifts is a ke...

Linear Regression Games: Convergence Guarantees to Approximate Out-of-Distribution Solutions

Recently, invariant risk minimization (IRM) (Arjovsky et al.) was propos...

Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments

Domain generalization aims at performing well on unseen test environment...
