GalilAI: Out-of-Task Distribution Detection using Causal Active Experimentation for Safe Transfer RL

by Sumedh A. Sontakke, et al.

Out-of-distribution (OOD) detection is a well-studied topic in supervised learning. Extending the successes of supervised learning methods to the reinforcement learning (RL) setting, however, is difficult due to the data-generating process: RL agents actively query their environment for data, and the data are a function of the policy followed by the agent. An agent could thus neglect a shift in the environment if its policy did not lead it to explore the aspect of the environment that shifted. Therefore, to achieve safe and robust generalization in RL, there exists an unmet need for OOD detection through active experimentation. Here, we attempt to bridge this lacuna by first defining a causal framework for OOD scenarios or environments encountered by RL agents in the wild. Then, we propose a novel task: that of Out-of-Task Distribution (OOTD) detection. We introduce an RL agent that actively experiments in a test environment and subsequently concludes whether it is OOTD or not. We name our method GalilAI, in honor of Galileo Galilei, as it discovers, among other causal processes, that gravitational acceleration is independent of the mass of a body. Finally, we propose a simple probabilistic neural network baseline for comparison, which extends extant Model-Based RL. We find that GalilAI outperforms the baseline significantly. See visualizations of our method
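The point that an agent can miss an environment shift when its policy never probes the shifted factor can be illustrated with a toy sketch. This is not the GalilAI method or the paper's baseline; the one-dimensional block environment, the names `rollout` and `looks_shifted`, and the parameters `mass`, `force`, and `tol` are all hypothetical and chosen only for illustration.

```python
def rollout(mass, force, steps=20, dt=0.1):
    """Simulate a block on a frictionless line and return its final position.

    Hypothetical toy dynamics: a constant force is applied at every step.
    """
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        vel += (force / mass) * dt  # acceleration depends on the hidden mass
        pos += vel * dt
    return pos


def looks_shifted(test_mass, force, train_mass=1.0, tol=1e-3):
    """Flag a shift when the test rollout deviates from the training-mass rollout."""
    expected = rollout(train_mass, force)
    observed = rollout(test_mass, force)
    return abs(observed - expected) > tol


# The test environment has a heavier block (mass 5.0 instead of 1.0).
passive = looks_shifted(test_mass=5.0, force=0.0)  # policy never pushes the block
active = looks_shifted(test_mass=5.0, force=1.0)   # policy experiments by pushing
print(passive, active)  # prints "False True"
```

With zero force the block never moves in either environment, so the change in mass is invisible to the passive policy; only the policy that pushes the block produces data in which the shift shows up.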




Causal Counterfactuals for Improving the Robustness of Reinforcement Learning

Reinforcement learning (RL) is applied in a wide variety of fields. RL e...

Resolving Spurious Correlations in Causal Models of Environments via Interventions

Causal models could increase interpretability, robustness to distributio...

Learning Causal Overhypotheses through Exploration in Children and Computational Models

Despite recent progress in reinforcement learning (RL), RL algorithms fo...

Reinforcement Learning with Temporal-Logic-Based Causal Diagrams

We study a class of reinforcement learning (RL) tasks where the objectiv...

Causal Influence Detection for Improving Efficiency in Reinforcement Learning

Many reinforcement learning (RL) environments consist of independent ent...

On the Generalization Gap in Reparameterizable Reinforcement Learning

Understanding generalization in reinforcement learning (RL) is a signifi...

EST: Evaluating Scientific Thinking in Artificial Agents

Theoretical ideas and empirical research have shown us a seemingly surpr...
