IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

02/05/2018
by Lasse Espeholt, et al.

In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters. A key challenge is handling the increased amount of data and extended training time, which is already a problem in single-task learning. We have developed a new distributed agent, IMPALA (Importance Weighted Actor-Learner Architecture), that can scale to thousands of machines and achieve a throughput of 250,000 frames per second. We achieve stable learning at high throughput by combining decoupled acting and learning with a novel off-policy correction method called V-trace. We demonstrate the effectiveness of IMPALA for multi-task reinforcement learning on DMLab-30 (a set of 30 tasks from the DeepMind Lab environment (Beattie et al., 2016)) and Atari-57 (all available Atari games in the Arcade Learning Environment (Bellemare et al., 2013a)). Our results show that IMPALA achieves better performance than previous agents with less data and, crucially, exhibits positive transfer between tasks as a result of its multi-task approach.
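The abstract names V-trace without defining it. As a rough illustration only, the sketch below assumes the truncated importance-sampling form described in the full paper: each temporal-difference error is weighted by a clipped ratio min(rho_bar, pi/mu), and corrections are propagated backwards through clipped "trace" coefficients min(c_bar, pi/mu). The function name `vtrace_targets`, its argument names, and the default values are hypothetical choices for this example, not the authors' code.

```python
import math


def vtrace_targets(behaviour_logp, target_logp, rewards, values,
                   bootstrap_value, gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace value targets for one trajectory of length T.

    behaviour_logp[t]: log mu(a_t | x_t) under the actor (behaviour) policy.
    target_logp[t]:    log pi(a_t | x_t) under the learner (target) policy.
    values[t]:         learner's value estimate V(x_t).
    bootstrap_value:   V(x_T), the estimate for the state after the last step.
    All names and defaults here are assumptions made for illustration.
    """
    T = len(rewards)
    # Importance ratios pi / mu, clipped separately for the TD weight (rho)
    # and for the backward trace coefficient (c).
    ratios = [math.exp(t - b) for t, b in zip(target_logp, behaviour_logp)]
    rhos = [min(rho_bar, r) for r in ratios]
    cs = [min(c_bar, r) for r in ratios]

    next_values = list(values)[1:] + [bootstrap_value]
    deltas = [rhos[t] * (rewards[t] + gamma * next_values[t] - values[t])
              for t in range(T)]

    # Backward recursion: v_t - V(x_t) = delta_t + gamma * c_t * (v_{t+1} - V(x_{t+1})).
    targets = [0.0] * T
    acc = 0.0
    for t in reversed(range(T)):
        acc = deltas[t] + gamma * cs[t] * acc
        targets[t] = values[t] + acc
    return targets
```

When the actor and learner policies coincide (all ratios equal 1) and both clipping thresholds are at least 1, these targets reduce to ordinary n-step bootstrapped returns. This makes for a convenient sanity check on any implementation.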

