Using Simulation to Improve Sample-Efficiency of Bayesian Optimization for Bipedal Robots

by Akshara Rai et al.

Learning for control can acquire controllers for novel robotic tasks, paving the way for autonomous agents. Such controllers can be expert-designed policies, which typically require parameter tuning for each task scenario. In this context, Bayesian optimization (BO) has emerged as a promising approach for automatically tuning controllers. However, when performing BO on hardware for high-dimensional policies, sample-efficiency can be an issue. Here, we develop an approach that uses simulation to map the original parameter space into a domain-informed space. During BO, similarity between controllers is then calculated in this transformed space. Experiments on the ATRIAS robot hardware and on a simulation of another bipedal robot show that our approach succeeds at sample-efficiently learning controllers for multiple robots. Another question arises: what if the simulation differs significantly from hardware? To answer this, we create increasingly approximate simulators and study the effect of increasing simulation-hardware mismatch on the performance of Bayesian optimization. We also compare our approach to other approaches from the literature and find it to be more reliable, especially in cases of high mismatch. Our experiments show that our approach succeeds across different controller types, bipedal robot models, and simulator fidelity levels, making it applicable to a wide range of bipedal locomotion problems.
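
To make the idea of measuring controller similarity in a transformed space concrete, here is a minimal sketch in Python. It is not the authors' implementation: `toy_simulator_features` is a hypothetical stand-in for whatever simulation-derived mapping is used, its two features are made up for illustration, and the squared-exponential kernel is an assumed (though common) choice for BO.

```python
import numpy as np


def toy_simulator_features(params):
    """Hypothetical stand-in for a simulation-based transform.

    Maps raw controller parameters to a small feature vector. The two
    features here (sum and Euclidean norm) are illustrative only; a real
    transform would summarize simulated robot behavior.
    """
    params = np.atleast_2d(np.asarray(params, dtype=float))
    return np.column_stack([params.sum(axis=1), np.linalg.norm(params, axis=1)])


def se_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    a = np.atleast_2d(np.asarray(a, dtype=float))
    b = np.atleast_2d(np.asarray(b, dtype=float))
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def transformed_kernel(x1, x2, length_scale=1.0):
    """Kernel evaluated on transformed features instead of raw parameters."""
    return se_kernel(
        toy_simulator_features(x1), toy_simulator_features(x2), length_scale
    )


# Two controllers that differ in raw parameter space but map to the same
# features are treated as identical by the transformed kernel.
x_a, x_b = [1.0, 2.0], [2.0, 1.0]
raw_similarity = se_kernel(x_a, x_b)[0, 0]
transformed_similarity = transformed_kernel(x_a, x_b)[0, 0]
```

In this toy setup the two parameter vectors are far apart under the raw kernel but fully similar under the transformed one, which is the kind of information sharing between behaviorally similar controllers that the abstract describes.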




Tuning Legged Locomotion Controllers via Safe Bayesian Optimization

In this paper, we present a data-driven strategy to simplify the deploym...

Using Deep Reinforcement Learning to Learn High-Level Policies on the ATRIAS Biped

Learning controllers for bipedal robots is a challenging problem, often ...

Bayesian Optimization in Variational Latent Spaces with Dynamic Compression

Data-efficiency is crucial for autonomous robots to adapt to new tasks a...

Sim-to-Real Transfer for Biped Locomotion

We present a new approach for transfer of dynamic robot control policies...

Combining Simulations and Real-robot Experiments for Bayesian Optimization of Bipedal Gait Stabilization

Walking controllers often require parametrization which must be tuned ac...

Bayesian optimization of distributed neurodynamical controller models for spatial navigation

Dynamical systems models for controlling multi-agent swarms have demonst...

Data-efficient Learning of Morphology and Controller for a Microrobot

Robot design is often a slow and difficult process requiring the iterati...
