Connected Superlevel Set in (Deep) Reinforcement Learning and its Application to Minimax Theorems

03/23/2023
by Sihan Zeng, et al.

This paper aims to improve the understanding of the optimization landscape of policy optimization problems in reinforcement learning. Specifically, we show that the superlevel set of the objective function with respect to the policy parameter is always a connected set, both in the tabular setting and under policies represented by a class of neural networks. In addition, we show that the optimization objective, viewed as a function of the policy parameter and the reward, satisfies a stronger "equiconnectedness" property. To the best of our knowledge, these properties have not been reported before. We apply the connectedness of these superlevel sets to derive minimax theorems for robust reinforcement learning. We show that any minimax optimization program that is convex on one side and equiconnected on the other side satisfies the minimax equality (i.e., has a Nash equilibrium). We find that this exact structure is exhibited by a robust reinforcement learning problem under an adversarial reward attack, so the validity of its minimax equality follows immediately. This is the first time such a result has been established in the literature.
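For readers unfamiliar with the terminology, the two central objects can be written out as follows. This is only a sketch in assumed notation: the symbols $J$, $\theta$, $\Theta$, $r$, $\mathcal{R}$, and $\lambda$ do not appear in the abstract and are introduced here for illustration, and assigning the convex side to the reward variable is an inference from the abstract's wording rather than a statement taken from the paper.

```latex
% Assumed notation (not from the abstract):
%   \theta \in \Theta      policy parameter
%   r \in \mathcal{R}      reward function chosen by the adversary
%   J(\theta, r)           policy optimization objective
%   \lambda \in \mathbb{R} threshold level

% Superlevel set of the objective at level \lambda, for a fixed reward r.
% The abstract's first claim is that this set is connected.
\[
  U_{\lambda}(r) \;=\; \{\, \theta \in \Theta \;:\; J(\theta, r) \ge \lambda \,\}
\]

% Minimax equality for the robust problem under a reward attack:
% the policy maximizes while the adversary minimizes over rewards.
\[
  \sup_{\theta \in \Theta} \, \inf_{r \in \mathcal{R}} J(\theta, r)
  \;=\;
  \inf_{r \in \mathcal{R}} \, \sup_{\theta \in \Theta} J(\theta, r)
\]
```

In this reading, the equality says that the order in which the policy and the adversary move does not change the value of the problem.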

research
10/04/2019

Discounted Reinforcement Learning is Not an Optimization Problem

Discounted reinforcement learning is fundamentally incompatible with fun...
research
11/27/2018

Understanding the impact of entropy in policy learning

Entropy regularization is commonly used to improve policy optimization i...
research
11/27/2018

Understanding the impact of entropy on policy optimization

Entropy regularization is commonly used to improve policy optimization i...
research
07/19/2022

Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments

Robust reinforcement learning (RL) considers the problem of learning pol...
research
06/13/2023

On Faking a Nash Equilibrium

We characterize offline data poisoning attacks on Multi-Agent Reinforcem...
research
10/12/2017

Is Epicurus the father of Reinforcement Learning?

The Epicurean Philosophy is commonly thought as simplistic and hedonisti...
