A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms

by Yoshua Bengio, et al.

We propose to meta-learn causal structures based on how fast a learner adapts to new distributions arising from sparse distributional changes, e.g., due to interventions, actions of agents, and other sources of non-stationarity. We show that, under this assumption, the correct causal structural choices lead to faster adaptation to modified distributions, because the changes are concentrated in one or just a few mechanisms when the learned knowledge is modularized appropriately. This leads to sparse expected gradients and a lower effective number of degrees of freedom that need to be relearned while adapting to the change. It motivates using the speed of adaptation to a modified distribution as a meta-learning objective. We demonstrate how this can be used to determine the cause-effect relationship between two observed variables. The distributional changes do not need to correspond to standard interventions (clamping a variable), and the learner has no direct knowledge of these interventions. We show that causal structures can be parameterized via continuous variables and learned end-to-end. We then explore how these ideas could also be used to learn an encoder that maps low-level observed variables to unobserved causal variables, leading to faster out-of-distribution adaptation. Such an encoder learns a representation space in which the assumptions of independent mechanisms, and of small and sparse changes in these mechanisms due to actions and non-stationarities, can be satisfied.
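The two-variable case described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation; everything below is an assumption made for illustration. Both variables are discrete, the true direction is A → B, each candidate model is a table of softmax logits, "pretraining" is replaced by initializing both models at the exact training distribution, and each distributional change resamples only the marginal of A. The structural parameter `gamma`, the mixture-of-likelihoods regret, and all hyperparameters are hypothetical choices made to match the idea of a continuous, end-to-end learnable structure.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def sigmoid(x):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * x))

def online_loglik(x, y, marg_logits, cond_logits, lr=1.0, batch=50):
    """Log-likelihood of p(x) p(y|x) accumulated while adapting by SGD.

    Each mini-batch is scored before the parameters are updated on it,
    so the total rewards models that adapt quickly.
    """
    marg, cond = marg_logits.copy(), cond_logits.copy()
    n, total = marg.shape[0], 0.0
    for start in range(0, len(x), batch):
        xb, yb = x[start:start + batch], y[start:start + batch]
        lm, lc = log_softmax(marg), log_softmax(cond)
        total += (lm[xb] + lc[xb, yb]).sum()
        counts = np.zeros((n, n))
        np.add.at(counts, (xb, yb), 1.0)   # co-occurrence counts in the batch
        row = counts.sum(axis=1)           # counts of x alone
        # Gradient ascent on the mean log-likelihood of the batch.
        marg += lr * (row / len(xb) - np.exp(lm))
        cond += lr * (counts - row[:, None] * np.exp(lc)) / len(xb)
    return total

def meta_learn_direction(n=10, episodes=200, steps=20, batch=50,
                         meta_lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Ground truth (assumed): A -> B with random categorical mechanisms.
    p_a = rng.dirichlet(np.ones(n))
    p_b_given_a = rng.dirichlet(np.ones(n), size=n)   # rows indexed by a
    joint = p_a[:, None] * p_b_given_a
    p_b = joint.sum(axis=0)
    p_a_given_b = (joint / p_b).T                     # rows indexed by b
    cdf = np.cumsum(p_b_given_a, axis=1)

    gamma = 0.0  # sigmoid(gamma) = current belief that A -> B
    for _ in range(episodes):
        # Sparse change: only the marginal of the cause is resampled.
        q_a = rng.dirichlet(np.ones(n))
        m = steps * batch
        a = rng.choice(n, size=m, p=q_a)
        u = rng.random(m)
        b = np.minimum((cdf[a] < u[:, None]).sum(axis=1), n - 1)

        # Both models start from the same (exact) training distribution.
        ll_ab = online_loglik(a, b, np.log(p_a), np.log(p_b_given_a), batch=batch)
        ll_ba = online_loglik(b, a, np.log(p_b), np.log(p_a_given_b), batch=batch)

        # Regret R = -log(sigmoid(gamma) * L_ab + (1 - sigmoid(gamma)) * L_ba);
        # its derivative is sigmoid(gamma) - sigmoid(gamma + log L_ab - log L_ba).
        grad = sigmoid(gamma) - sigmoid(gamma + (ll_ab - ll_ba))
        gamma -= meta_lr * grad
    return sigmoid(gamma)

belief = meta_learn_direction()
print(f"belief that A causes B: {belief:.3f}")
```

In this setup the A → B model only has to relearn the marginal of A, whereas the B → A model has to relearn both of its tables, so its online likelihood should lag and `gamma` should drift toward the causal direction. The gap depends on the learning rate, batch size, and number of categories chosen here.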


