Learning Gentle Object Manipulation with Curiosity-Driven Deep Reinforcement Learning

by Sandy H. Huang, et al.

Robots must know how to be gentle when they need to interact with fragile objects, or when the robot itself is prone to wear and tear. We propose an approach that enables deep reinforcement learning to train policies that are gentle, both during exploration and task execution. In a reward-based learning environment, a natural approach involves augmenting the (task) reward with a penalty for non-gentleness, which can be defined as excessive impact force. However, augmenting with only this penalty impairs learning: policies get stuck in a local optimum which avoids all contact with the environment. Prior research has shown that combining auxiliary tasks or intrinsic rewards can be beneficial for stabilizing and accelerating learning in sparse-reward domains, and indeed we find that introducing a surprise-based intrinsic reward does avoid the no-contact failure case. However, we show that a simple dynamics-based surprise is not as effective as penalty-based surprise. Penalty-based surprise, based on predicting forceful contacts, has a further benefit: it encourages exploration which is contact-rich yet gentle. We demonstrate the effectiveness of the approach using a complex, tendon-powered robot hand with tactile sensors. Videos are available at http://sites.google.com/view/gentlemanipulation.
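The reward structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the force threshold, the weights, the squared-error form of the surprise, and all function names are assumptions made for illustration only. It shows a task reward reduced by a penalty on excessive impact force, plus an intrinsic bonus that grows when a (hypothetical) learned predictor misjudges that penalty.

```python
# Illustrative sketch only: names, threshold, and weights are assumptions,
# not values or formulas taken from the paper.

def impact_penalty(impact_force: float, force_threshold: float = 1.0) -> float:
    """Penalty for non-gentleness: the impact force in excess of a threshold."""
    return max(0.0, impact_force - force_threshold)


def penalty_surprise(predicted_penalty: float, actual_penalty: float) -> float:
    """Penalty-based surprise, assumed here to be the squared prediction error
    of a model that tries to predict the penalty for forceful contact."""
    return (predicted_penalty - actual_penalty) ** 2


def total_reward(
    task_reward: float,
    impact_force: float,
    predicted_penalty: float,
    penalty_weight: float = 1.0,
    surprise_weight: float = 0.1,
    force_threshold: float = 1.0,
) -> float:
    """Task reward, minus the weighted penalty, plus the weighted intrinsic reward."""
    penalty = impact_penalty(impact_force, force_threshold)
    surprise = penalty_surprise(predicted_penalty, penalty)
    return task_reward - penalty_weight * penalty + surprise_weight * surprise


if __name__ == "__main__":
    # A forceful contact that the predictor did not anticipate.
    print(total_reward(task_reward=1.0, impact_force=3.0, predicted_penalty=0.0))
```

With the surprise weight set to zero, the sketch reduces to the penalty-only reward that the abstract reports as getting stuck in a no-contact local optimum. The intrinsic term gives the agent a reason to seek out contacts whose forcefulness it cannot yet predict.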



