Efficient Conditionally Invariant Representation Learning

by Roman Pogodin, et al.

We introduce the Conditional Independence Regression CovariancE (CIRCE), a measure of conditional independence for multivariate continuous-valued variables. CIRCE applies as a regularizer in settings where we wish to learn neural features φ(X) of data X to estimate a target Y, while being conditionally independent of a distractor Z given Y. Both Z and Y are assumed to be continuous-valued but relatively low dimensional, whereas X and its features may be complex and high dimensional. Relevant settings include domain-invariant learning, fairness, and causal learning. The procedure requires just a single ridge regression from Y to kernelized features of Z, which can be done in advance. It is then only necessary to enforce independence of φ(X) from residuals of this regression, which is possible with attractive estimation properties and consistency guarantees. By contrast, earlier measures of conditional feature dependence require multiple regressions for each step of feature learning, resulting in more severe bias and variance, and greater computational cost. When sufficiently rich features are used, we establish that CIRCE is zero if and only if φ(X) ⊥⊥ Z | Y. In experiments, we show superior performance to previous methods on challenging benchmarks, including learning conditionally invariant image features.
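The two-stage procedure described above can be illustrated with a minimal NumPy sketch. This is not the authors' estimator: it assumes Gaussian kernels, random Fourier features as the "kernelized features of Z", a held-out split for the ridge regression, and a simple biased V-statistic for the dependence penalty. All function names, bandwidths, and the ridge parameter are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_gram(a, b, bandwidth=1.0):
    """Gaussian kernel Gram matrix between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def random_fourier_features(z, freqs, phases):
    """Finite-dimensional stand-in for kernelized features of Z (an assumption)."""
    return np.sqrt(2.0 / freqs.shape[1]) * np.cos(z @ freqs + phases)


def fit_ridge(y_train, psi_z_train, ridge=1e-3):
    """Stage 1, done once in advance: kernel ridge regression from Y to psi(Z)."""
    n = y_train.shape[0]
    k_yy = gaussian_gram(y_train, y_train)
    return np.linalg.solve(k_yy + ridge * n * np.eye(n), psi_z_train)


def dependence_penalty(phi_x, y, psi_z, y_train, alpha):
    """Stage 2: dependence between (phi(X), Y) and the regression residuals."""
    residuals = psi_z - gaussian_gram(y, y_train) @ alpha
    k_joint = gaussian_gram(phi_x, phi_x) * gaussian_gram(y, y)
    n = y.shape[0]
    # Biased V-statistic: squared norm of the empirical cross-covariance.
    return float(np.sum(k_joint * (residuals @ residuals.T)) / n**2)


# Synthetic demo (hypothetical data): Z = Y + noise.
rng = np.random.default_rng(0)
n, n_features = 200, 64
freqs = rng.normal(size=(1, n_features))
phases = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

# Held-out split used only for the regression from Y to psi(Z).
y_train = rng.normal(size=(n, 1))
z_train = y_train + rng.normal(size=(n, 1))
alpha = fit_ridge(y_train, random_fourier_features(z_train, freqs, phases))

# Split on which the penalty is evaluated.
y = rng.normal(size=(n, 1))
z = y + rng.normal(size=(n, 1))
psi_z = random_fourier_features(z, freqs, phases)

phi_indep = y + 0.5 * rng.normal(size=(n, 1))  # depends on Z only through Y
phi_dep = z.copy()                             # depends on Z even given Y

penalty_indep = dependence_penalty(phi_indep, y, psi_z, y_train, alpha)
penalty_dep = dependence_penalty(phi_dep, y, psi_z, y_train, alpha)
print(penalty_indep, penalty_dep)
```

In a feature-learning loop, only `dependence_penalty` would be recomputed as φ changes, since `alpha` is fixed after the first stage. In this sketch the penalty for features that copy Z should come out noticeably larger than for features that depend on Z only through Y.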



