
On the Emergence of Whole-body Strategies from Humanoid Robot Push-recovery Learning

by Diego Ferigo, et al.

Balancing and push recovery are essential capabilities enabling humanoid robots to solve complex locomotion tasks. In this context, classical control systems tend to be based on simplified physical models and hard-coded strategies. Although successful in specific scenarios, this approach requires demanding parameter tuning and hand-crafted switching logic between purpose-built controllers to handle more general perturbations. We apply model-free Deep Reinforcement Learning to train a general and robust humanoid push-recovery policy in a simulation environment. Our method targets high-dimensional whole-body humanoid control and is validated on the iCub humanoid. Reward components incorporating expert knowledge of humanoid control enable the same policy to quickly learn several robust behaviors spanning the entire body. We validate our method with extensive quantitative analyses in simulation, including out-of-sample tasks that demonstrate policy robustness and generalization, both key requirements for real-world robot deployment.
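The abstract mentions reward components that encode expert knowledge of humanoid control. As a minimal illustrative sketch (not the paper's actual reward), such shaping is often implemented as a weighted sum of bounded per-objective terms, each mapped through a radial-basis kernel so that every component lies in (0, 1]. The term names, weights, and kernel sensitivities below are all hypothetical:

```python
import numpy as np

def rbf_kernel(error, sensitivity):
    """Map an error vector's squared norm to a bounded reward in (0, 1]."""
    return float(np.exp(-sensitivity * np.dot(error, error)))

def push_recovery_reward(com_error, posture_error, joint_torques,
                         weights=(0.5, 0.3, 0.2)):
    """Hypothetical composite reward for whole-body push recovery.

    Combines three expert-knowledge terms with fixed weights:
    - keep the center of mass near its reference,
    - stay close to a nominal joint posture,
    - discourage large joint torques (control effort).
    """
    r_com = rbf_kernel(com_error, 10.0)        # CoM tracking term
    r_posture = rbf_kernel(posture_error, 1.0) # posture regularization term
    r_effort = rbf_kernel(joint_torques, 1e-3) # effort penalty term
    w_com, w_posture, w_effort = weights
    return w_com * r_com + w_posture * r_posture + w_effort * r_effort
```

Because each kernelized term is bounded, the composite reward is also bounded by the sum of the weights, which tends to make the relative importance of objectives easier to tune than unbounded quadratic penalties.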

