Quantifying Generalization in Reinforcement Learning

12/06/2018
by Karl Cobbe, et al.

In this paper, we investigate the problem of overfitting in deep reinforcement learning. In the most common RL benchmarks, it is customary to use the same environments for both training and testing. This practice offers relatively little insight into an agent's ability to generalize. We address this issue by using procedurally generated environments to construct distinct training and test sets. Most notably, we introduce a new environment called CoinRun, designed as a benchmark for generalization in RL. Using CoinRun, we find that agents overfit to surprisingly large training sets. We then show that deeper convolutional architectures improve generalization, as do methods traditionally found in supervised learning, including L2 regularization, dropout, data augmentation and batch normalization.
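The train/test protocol described in the abstract can be illustrated with a toy sketch. This is not CoinRun or the authors' code: `make_level`, `split_seeds`, and `MemorizingAgent` are hypothetical names, the "levels" are random strings of moves, and the agent simply memorizes training levels. The only point is to show how disjoint sets of generator seeds give separate training and test levels, and how the gap between the two scores exposes overfitting.

```python
import random


def make_level(seed, length=16):
    """Hypothetical procedural generator: a seed deterministically yields a level.

    A level here is just the sequence of moves ('L' or 'R') that solves it.
    """
    rng = random.Random(seed)
    return tuple(rng.choice("LR") for _ in range(length))


def split_seeds(num_train, num_test):
    """Assign disjoint seed ranges to the training and test sets."""
    train = list(range(num_train))
    test = list(range(num_train, num_train + num_test))
    return train, test


class MemorizingAgent:
    """Toy agent that overfits by design: it only knows levels it has seen."""

    def __init__(self):
        self.seen = set()

    def train(self, seeds):
        for seed in seeds:
            self.seen.add(make_level(seed))

    def score(self, seed):
        """Fraction of correct moves on the level generated from `seed`."""
        level = make_level(seed)
        # Replay a memorized level, otherwise fall back to always moving left.
        moves = level if level in self.seen else ("L",) * len(level)
        return sum(a == b for a, b in zip(moves, level)) / len(level)


def mean_score(agent, seeds):
    return sum(agent.score(s) for s in seeds) / len(seeds)


def generalization_gap(agent, train_seeds, test_seeds):
    """Training performance minus test performance."""
    return mean_score(agent, train_seeds) - mean_score(agent, test_seeds)


train_seeds, test_seeds = split_seeds(num_train=100, num_test=200)
agent = MemorizingAgent()
agent.train(train_seeds)
gap = generalization_gap(agent, train_seeds, test_seeds)
print(f"train={mean_score(agent, train_seeds):.2f} "
      f"test={mean_score(agent, test_seeds):.2f} gap={gap:.2f}")
```

In this sketch the agent scores perfectly on its training levels and close to chance on held-out ones. Evaluating on the training levels alone would hide that difference, which is the concern the abstract raises about reusing the same environments for training and testing.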

research
06/23/2020

Automatic Data Augmentation for Generalization in Deep Reinforcement Learning

Deep reinforcement learning (RL) agents often fail to generalize to unse...
research
10/28/2019

Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck

The ability for policies to generalize to new environments is key to the...
research
09/29/2018

Generalization and Regularization in DQN

Deep reinforcement learning (RL) algorithms have shown an impressive abi...
research
02/17/2021

Time Matters in Using Data Augmentation for Vision-based Deep Reinforcement Learning

Data augmentation technique from computer vision has been widely conside...
research
04/13/2022

Local Feature Swapping for Generalization in Reinforcement Learning

Over the past few years, the acceleration of computing resources and res...
research
12/03/2019

Leveraging Procedural Generation to Benchmark Reinforcement Learning

In this report, we introduce Procgen Benchmark, a suite of 16 procedural...
research
07/13/2022

Brick Tic-Tac-Toe: Exploring the Generalizability of AlphaZero to Novel Test Environments

Traditional reinforcement learning (RL) environments typically are the s...
