Reliable validation of Reinforcement Learning Benchmarks

Reinforcement Learning (RL) is one of the most dynamic research areas in Game AI and in AI as a whole, and a wide variety of games serve as its prominent test problems. However, RL is subject to the replicability crisis that currently affects most algorithmic AI research, and benchmarking in RL could be improved through verifiable results. Numerous benchmark environments, such as Atari, provide scores that are used to compare different algorithms. Nevertheless, reviewers must trust that the reported figures represent truthful values, as it is difficult to reproduce an exact training curve. We propose improving this situation by providing access to the original experimental data so that study results can be validated. To that end, we rely on the concept of minimal traces. These allow re-simulation of action sequences in deterministic RL environments and, in turn, enable reviewers to verify, re-use, and manually inspect experimental results without needing large compute clusters. They also permit validation of presented reward graphs, inspection of individual episodes, and re-use of result data (baselines) for proper comparison in follow-up papers. We offer plug-and-play code that works with Gym so that our measures fit well into the existing RL and reproducibility ecosystem. Our approach is freely available, easy to use, and adds minimal overhead, as minimal traces allow a data compression ratio of up to ≈ 10^4:1 (94 GB to 8 MB for Atari Pong) compared to a regular MDP trace used in offline RL datasets. The paper presents proof-of-concept results for a variety of games.
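
The idea of re-simulating an action sequence in a deterministic environment can be illustrated with a small sketch. This is not the authors' plug-and-play code and it does not use Gym: `ToyEnv`, `MinimalTrace`, `run_and_record`, and `replay` are hypothetical names introduced here, and the sketch assumes that a trace consisting of the environment seed plus the chosen actions is enough to reproduce an episode. The environment only mimics a Gym-like `reset`/`step` interface.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


class ToyEnv:
    """Hypothetical deterministic environment with a Gym-like interface."""

    def __init__(self, length: int = 20):
        self.length = length

    def reset(self, seed: int) -> int:
        # All environment randomness is derived from the seed, so an episode
        # is fully determined by (seed, action sequence).
        self.rng = random.Random(seed)
        self.state = self.rng.randint(0, 9)
        self.t = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        noise = self.rng.randint(0, 2)
        self.state = (self.state + action + noise) % 10
        reward = 1.0 if self.state == 0 else 0.0
        self.t += 1
        return self.state, reward, self.t >= self.length


@dataclass
class MinimalTrace:
    """Assumed trace format: seed and actions only, no observations or rewards."""
    seed: int
    actions: List[int] = field(default_factory=list)


def run_and_record(env: ToyEnv, policy: Callable[[int], int], seed: int):
    """Run one episode and store only what is needed to re-simulate it."""
    trace = MinimalTrace(seed=seed)
    obs = env.reset(seed)
    total, done = 0.0, False
    while not done:
        action = policy(obs)
        trace.actions.append(action)
        obs, reward, done = env.step(action)
        total += reward
    return trace, total


def replay(env: ToyEnv, trace: MinimalTrace) -> float:
    """Re-simulate the recorded actions and recompute the episode return."""
    env.reset(trace.seed)
    total = 0.0
    for action in trace.actions:
        _, reward, _ = env.step(action)
        total += reward
    return total


# The agent may be stochastic; the replay does not need the agent at all.
agent_rng = random.Random(123)
trace, reported = run_and_record(ToyEnv(), lambda obs: agent_rng.randint(0, 2), seed=42)
verified = replay(ToyEnv(), trace)
print("reported return matches replay:", verified == reported)
```

In this sketch a reviewer needs only the stored trace and the environment to recompute the reported return. Observations and rewards are regenerated by the simulation instead of being stored, which is where the size advantage over a full MDP trace would come from.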
