Divide-and-Conquer Reinforcement Learning

11/27/2017
by Dibya Ghosh, et al.

Standard model-free deep reinforcement learning (RL) algorithms sample a new initial state for each trial, allowing them to optimize policies that can perform well even in highly stochastic environments. However, problems that exhibit considerable initial state variation typically produce high-variance gradient estimates for model-free RL, making direct policy or value function optimization challenging. In this paper, we develop a novel algorithm that instead optimizes an ensemble of policies, each on a different "slice" of the initial state space, and gradually unifies them into a single policy that can succeed on the whole state space. This approach, which we term divide-and-conquer RL, is able to solve complex tasks where conventional deep RL methods are ineffective. Our results show that divide-and-conquer RL greatly outperforms conventional policy gradient methods on challenging grasping, manipulation, and locomotion tasks, and exceeds the performance of a variety of prior methods.
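The variance argument in the abstract can be illustrated with a small toy sketch. It is not the authors' algorithm: the 1-D initial state, the quantile-based `slice_initial_states` helper, and the `toy_return` function are all hypothetical choices made only for illustration. It shows that when the return depends strongly on the initial state, returns within each slice vary much less than returns over the whole initial state distribution, which is the motivation for training one policy per slice.

```python
import random
import statistics


def slice_initial_states(states, num_slices):
    """Split 1-D initial states into contiguous, roughly equal-sized slices.

    Hypothetical helper: a simple quantile split, used here only as a
    stand-in for whatever partitioning scheme is actually used.
    """
    ordered = sorted(states)
    size, extra = divmod(len(ordered), num_slices)
    slices, start = [], 0
    for i in range(num_slices):
        end = start + size + (1 if i < extra else 0)
        slices.append(ordered[start:end])
        start = end
    return slices


def toy_return(s0):
    """Hypothetical return that depends strongly on the initial state."""
    return -abs(s0)


rng = random.Random(0)
initial_states = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
slices = slice_initial_states(initial_states, num_slices=4)

# Variance of returns over the full initial state distribution ...
overall_var = statistics.pvariance([toy_return(s) for s in initial_states])
# ... versus the variance of returns inside each slice.
slice_vars = [
    statistics.pvariance([toy_return(s) for s in sl]) for sl in slices
]

print("overall variance:", overall_var)
print("per-slice variances:", slice_vars)
```

In this toy setting every per-slice variance is smaller than the overall variance. The sketch covers only the "divide" step; gradually unifying the per-slice policies into a single policy is not shown.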

research
09/07/2019

Soft Policy Gradient Method for Maximum Entropy Deep Reinforcement Learning

Maximum entropy deep reinforcement learning (RL) methods have been demon...
research
11/29/2022

Approximating Martingale Process for Variance Reduction in Deep Reinforcement Learning with Large State Space

Approximating Martingale Process (AMP) is proven to be effective for var...
research
09/22/2021

MEPG: A Minimalist Ensemble Policy Gradient Framework for Deep Reinforcement Learning

Ensemble reinforcement learning (RL) aims to mitigate instability in Q-l...
research
09/26/2021

On the Feasibility of Learning Finger-gaiting In-hand Manipulation with Intrinsic Sensing

Finger-gaiting manipulation is an important skill to achieve large-angle...
research
03/07/2019

When random search is not enough: Sample-Efficient and Noise-Robust Blackbox Optimization of RL Policies

Interest in derivative-free optimization (DFO) and "evolutionary strateg...
research
08/11/2023

A Deep Recurrent-Reinforcement Learning Method for Intelligent AutoScaling of Serverless Functions

Function-as-a-Service (FaaS) introduces a lightweight, function-based cl...
research
05/14/2019

Control Regularization for Reduced Variance Reinforcement Learning

Dealing with high variance is a significant challenge in model-free rein...
