Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals

02/09/2023
by Yue Wu, et al.
High sample complexity has long been a challenge for reinforcement learning (RL). Humans, by contrast, learn to perform tasks not only from interaction or demonstrations, but also by reading unstructured text documents such as instruction manuals. Instruction manuals and wiki pages are among the most abundant sources of data that could inform agents about valuable features and policies, or about task-specific environmental dynamics and reward structures. We therefore hypothesize that the ability to use human-written instruction manuals when learning policies for specific tasks should lead to a more efficient and better-performing agent. We propose the Read and Reward framework, which speeds up RL algorithms on Atari games by reading the manuals released by the Atari game developers. The framework consists of a QA Extraction module, which extracts and summarizes relevant information from the manual, and a Reasoning module, which evaluates object-agent interactions based on that information. An auxiliary reward is then provided to a standard A2C RL agent whenever an interaction is detected. When assisted by our design, A2C improves on 4 games in the Atari environment with sparse rewards, and requires 1000x fewer training frames than the previous SOTA, Agent 57, on Skiing, the hardest game in Atari.
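The auxiliary-reward step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that an object detector yields axis-aligned bounding boxes, that an "interaction" means the agent's box overlaps an object's box (optionally within a pixel margin), and that the Reasoning module's output has already been reduced to a per-object verdict (beneficial or harmful). The names `Box`, `boxes_interact`, `shaped_reward`, `manual_verdicts`, `bonus`, and `margin` are all hypothetical, and the object labels in the usage example are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (assumed detector output)."""
    x: float
    y: float
    w: float
    h: float


def boxes_interact(a: Box, b: Box, margin: float = 0.0) -> bool:
    """True if the boxes overlap, or lie within `margin` pixels of each other."""
    return (
        a.x - margin < b.x + b.w
        and b.x - margin < a.x + a.w
        and a.y - margin < b.y + b.h
        and b.y - margin < a.y + a.h
    )


def shaped_reward(
    env_reward: float,
    agent: Box,
    objects: Dict[str, Box],
    manual_verdicts: Dict[str, Optional[bool]],
    bonus: float = 1.0,
    margin: float = 0.0,
) -> float:
    """Add an auxiliary term to the environment reward on detected interactions.

    `manual_verdicts` maps an object name to True (the manual suggests the
    interaction is beneficial), False (harmful), or None (unknown). This
    mapping is a stand-in for the output of a manual-reading module.
    """
    aux = 0.0
    for name, box in objects.items():
        verdict = manual_verdicts.get(name)
        if verdict is None or not boxes_interact(agent, box, margin):
            continue  # no opinion from the manual, or no interaction
        aux += bonus if verdict else -bonus
    return env_reward + aux


# Usage: the agent overlaps a "flag" (assumed beneficial) but not the "tree".
agent = Box(10, 10, 4, 4)
objects = {"flag": Box(12, 12, 4, 4), "tree": Box(50, 50, 4, 4)}
verdicts = {"flag": True, "tree": False}
reward = shaped_reward(0.0, agent, objects, verdicts)
```

The shaped value would replace the raw environment reward passed to the RL learner at each step. Keeping the shaping outside the learner is a design choice of this sketch that leaves the underlying algorithm unchanged.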

research
02/24/2021

PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning

We study reinforcement learning (RL) with no-reward demonstrations, a se...
research
02/21/2023

Potential-based reward shaping for learning to play text-based adventure games

Text-based games are a popular testbed for language-based reinforcement ...
research
10/28/2020

Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in First-person Simulated 3D Environments

First-person object-interaction tasks in high-fidelity, 3D, simulated en...
research
05/26/2023

A Reminder of its Brittleness: Language Reward Shaping May Hinder Learning for Instruction Following Agents

Teaching agents to follow complex written instructions has been an impor...
research
03/16/2021

Learning to Shape Rewards using a Game of Switching Controls

Reward shaping (RS) is a powerful method in reinforcement learning (RL) ...
research
06/20/2022

EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL

Reinforcement learning (RL) in long horizon and sparse reward tasks is n...
