Constrained episodic reinforcement learning in concave-convex and knapsack settings

06/09/2020
by Kianté Brantley, et al.

We propose an algorithm for tabular episodic reinforcement learning with constraints. We provide a modular analysis with strong theoretical guarantees for settings with concave rewards and convex constraints, and for settings with hard constraints (knapsacks). Most of the previous work in constrained reinforcement learning is limited to linear constraints, and the remaining work focuses on either the feasibility question or settings with a single episode. Our experiments demonstrate that the proposed algorithm significantly outperforms these approaches in existing constrained episodic environments.

research
02/22/2021

Escaping from Zero Gradient: Revisiting Action-Constrained Reinforcement Learning via Frank-Wolfe Policy Optimization

Action-constrained reinforcement learning (RL) is a widely-used approach...
research
09/12/2021

Concave Utility Reinforcement Learning with Zero-Constraint Violations

We consider the problem of tabular infinite horizon concave utility rein...
research
10/20/2020

Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification

Many real-world physical control systems are required to satisfy constra...
research
05/25/2023

Learning Safety Constraints from Demonstrations with Unknown Rewards

We propose Convex Constraint Learning for Reinforcement Learning (CoCoRL...
research
02/21/2022

A Globally Convergent Evolutionary Strategy for Stochastic Constrained Optimization with Applications to Reinforcement Learning

Evolutionary strategies have recently been shown to achieve competing le...
research
11/10/2021

DeCOM: Decomposed Policy for Constrained Cooperative Multi-Agent Reinforcement Learning

In recent years, multi-agent reinforcement learning (MARL) has presented...
research
09/08/2022

An Empirical Evaluation of Posterior Sampling for Constrained Reinforcement Learning

We study a posterior sampling approach to efficient exploration in const...
