Explaining Reinforcement Learning with Shapley Values

06/09/2023
by Daniel Beechey et al.

For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.
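For readers unfamiliar with the underlying machinery: the Shapley value of a player i under a characteristic function v is phi_i(v) = sum over S subset of F\{i} of [|S|! (|F|-|S|-1)! / |F|!] * (v(S + {i}) - v(S)), i.e. the average marginal contribution of i over all orderings of the players. The sketch below is a minimal illustration of this computation, not the paper's SVERL implementation; the characteristic function value_fn is a hypothetical stand-in for whatever payoff is being explained (for example, an agent's expected performance when only the state features in the coalition are observed).

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values: the average marginal contribution of each
    feature over all coalitions of the remaining features.

    value_fn(coalition) is a user-supplied (here hypothetical)
    characteristic function mapping a frozenset of features to a payoff,
    e.g. an agent's expected return when only those features are revealed.
    """
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                S = frozenset(S)
                total += weight * (value_fn(S | {i}) - value_fn(S))
        phi[i] = total
    return phi

# Toy usage: payoff is earned only when features "x" and "y" are seen together.
if __name__ == "__main__":
    def value_fn(coalition):
        return 1.0 if {"x", "y"} <= coalition else 0.0
    print(shapley_values(["x", "y", "z"], value_fn))
    # -> {'x': 0.5, 'y': 0.5, 'z': 0.0}
```

Note that the exact computation enumerates all 2^|F| coalitions, which is why practical uses of Shapley values typically rely on sampling-based approximations.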

Related research

11/10/2020
What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes
We present a novel form of explanation for Reinforcement Learning, based...

08/18/2021
Explainable Deep Reinforcement Learning Using Introspection in a Non-episodic Task
Explainable reinforcement learning allows artificial agents to explain t...

10/11/2020
Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions
We investigate a deep reinforcement learning (RL) architecture that supp...

09/09/2022
Shapley value-based approaches to explain the robustness of classifiers in machine learning
In machine learning, the use of algorithm-agnostic approaches is an emer...

02/10/2020
Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning
In many real-world settings, a team of cooperative agents must learn to ...

04/16/2020
Solving bitvectors with MCSAT: explanations from bits and pieces (long version)
We present a decision procedure for the theory of fixed-sized bitvectors...
