Multi-Stage Episodic Control for Strategic Exploration in Text Games

by Jens Tuyls, et al.

Text adventure games present unique challenges to reinforcement learning methods due to their combinatorially large action spaces and sparse rewards. The interplay of these two factors is particularly demanding because large action spaces require extensive exploration, while sparse rewards provide limited feedback. This work proposes to tackle the explore-vs-exploit dilemma using a multi-stage approach that explicitly disentangles these two strategies within each episode. Our algorithm, called eXploit-Then-eXplore (XTX), begins each episode using an exploitation policy that imitates a set of promising trajectories from the past, and then switches over to an exploration policy aimed at discovering novel actions that lead to unseen state spaces. This policy decomposition allows us to combine global decisions about which parts of the game space to return to with curiosity-based local exploration in that space, motivated by how a human may approach these games. Our method significantly outperforms prior approaches by 27% and 11% average normalized score over 12 games from the Jericho benchmark (Hausknecht et al., 2020) in the deterministic and stochastic settings, respectively. On the game of Zork1, in particular, XTX obtains a score of 103, more than a 2x improvement over prior methods, and pushes past several known bottlenecks in the game that have plagued previous state-of-the-art methods.
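
To make the two-phase structure concrete, the following is a minimal toy sketch of an exploit-then-explore episode loop. It is not the authors' implementation: the chain environment, the dictionary-based "imitation" of a stored trajectory, the count-based novelty rule standing in for curiosity, and all function names (`exploit_action`, `explore_action`, `run_episode`) are assumptions made purely for illustration.

```python
import random
from collections import Counter


def exploit_action(state, demo):
    """Imitate a stored trajectory: return the demo's action for this state, or None."""
    return demo.get(state)


def explore_action(state, actions, visit_counts, rng):
    """Count-based stand-in for curiosity: pick a least-tried action in this state."""
    least = min(visit_counts[(state, a)] for a in actions)
    candidates = [a for a in actions if visit_counts[(state, a)] == least]
    return rng.choice(candidates)


def run_episode(transition, start, actions, demo, max_steps, visit_counts, rng):
    """Follow the demo while it applies, then switch to exploration for the rest."""
    state, phase, trajectory = start, "exploit", []
    for _ in range(max_steps):
        action = exploit_action(state, demo) if phase == "exploit" else None
        if action is None:
            # Demo exhausted (or already exploring): switch once and never go back.
            phase = "explore"
            action = explore_action(state, actions, visit_counts, rng)
        visit_counts[(state, action)] += 1
        trajectory.append((state, action, phase))
        state = transition(state, action)
    return trajectory, state


def chain_transition(state, action):
    """Hypothetical toy environment: integer positions 0..10 on a line."""
    return min(state + 1, 10) if action == "right" else max(state - 1, 0)


# A "promising past trajectory" that walks from position 0 to position 3.
demo = {0: "right", 1: "right", 2: "right"}
trajectory, final_state = run_episode(
    chain_transition, 0, ["left", "right"], demo, 6, Counter(), random.Random(0)
)
for state, action, phase in trajectory:
    print(phase, state, action)
```

In this sketch the switch happens when the stored trajectory has no action for the current state; this is a simplification chosen for brevity and should not be read as the switching rule used by XTX.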


