Learning to Generalize with Object-centric Agents in the Open World Survival Game Crafter

by Aleksandar Stanić et al.

Reinforcement learning agents must generalize beyond their training experience, yet prior work has focused mostly on identical training and evaluation environments. Starting from the recently introduced Crafter benchmark, a 2D open-world survival game, we introduce a new set of environments suitable for evaluating an agent's ability to generalize to previously unseen (numbers of) objects and to adapt quickly (meta-learning). In Crafter, agents are evaluated by the number of unlocked achievements (such as collecting resources) when trained for 1M steps. We show that current agents struggle to generalize, and introduce novel object-centric agents that improve over strong baselines. We also provide critical insights of general interest for future work on Crafter through several experiments. We show that careful hyper-parameter tuning improves the PPO baseline agent by a large margin and that even feedforward agents can unlock almost all achievements by relying on the inventory display. We achieve new state-of-the-art performance on the original Crafter environment. Additionally, when trained beyond 1M steps, our tuned agents can unlock almost all achievements. We show that recurrent PPO agents improve over feedforward ones, even with the inventory information removed. We introduce CrafterOOD, a set of 15 new environments that evaluate OOD generalization. On CrafterOOD, we show that current agents fail to generalize, whereas our novel object-centric agents achieve state-of-the-art OOD generalization while also being interpretable. Our code is public.
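For reference, the Crafter benchmark aggregates per-achievement success rates into a single score using a geometric-mean-style formula, score = exp(mean(ln(1 + s_i))) − 1, with the rates s_i given in percent, so that rare, difficult achievements are weighted more heavily than easy ones. A minimal sketch of that aggregation (the function name `crafter_score` is illustrative, not part of the official library):

```python
import math

def crafter_score(success_rates_pct):
    """Aggregate per-achievement success rates (in percent) into the
    Crafter score: exp(mean(ln(1 + s_i))) - 1. The logarithm dampens
    the influence of easy, frequently unlocked achievements."""
    logs = [math.log(1.0 + s) for s in success_rates_pct]
    return math.exp(sum(logs) / len(logs)) - 1.0

# An agent that unlocks every achievement on every episode scores ~100;
# an agent that never unlocks any achievement scores 0.
print(crafter_score([100.0] * 22))
print(crafter_score([0.0] * 22))
```

Because the per-achievement rates enter through a log, improving a near-zero success rate raises the score far more than improving an already-high one, which is why unlocking *almost all* achievements matters more than unlocking a few achievements very reliably.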


