MPC-Net: A First Principles Guided Policy Search

09/11/2019
by Jan Carius, et al.

We present an Imitation Learning approach for the control of dynamical systems with a known model. Our policy search method is guided by solutions from Model Predictive Control (MPC). In contrast to approaches that minimize a distance metric between the guiding demonstrations and the learned policy, our loss function corresponds to the minimization of the control Hamiltonian, which derives from the principle of optimality. Our algorithm therefore directly attempts to solve the Hamilton-Jacobi-Bellman (HJB) optimality equation with a parameterized class of control laws. Because the loss function explicitly encodes physical constraints, the learned controller achieves an improved constraint satisfaction metric. We train a mixture-of-experts neural network architecture for controlling a quadrupedal robot and show that this policy structure is well suited for such multimodal systems. The learned policy can successfully stabilize different gaits on the real walking robot using less than 10 minutes of demonstration data.
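To make the idea of a Hamiltonian loss concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. It assumes a scalar linear system x_dot = a*x + b*u with quadratic stage cost 0.5*(q*x^2 + r*u^2), for which the value function V(x) = 0.5*p*x^2 is known in closed form (p = 1 for a = 0 and b = q = r = 1). In the setting described in the abstract, the value-function gradient would come from the MPC solver instead. The names `hamiltonian` and `fit_linear_policy` and the linear policy u = k*x are hypothetical choices made for illustration only.

```python
import numpy as np

def hamiltonian(x, u, p, a=0.0, b=1.0, q=1.0, r=1.0):
    """Control Hamiltonian H = L(x, u) + dV/dx * f(x, u) for the toy system."""
    stage_cost = 0.5 * (q * x**2 + r * u**2)   # L(x, u)
    value_gradient = p * x                     # dV/dx for V = 0.5 * p * x^2
    return stage_cost + value_gradient * (a * x + b * u)

def fit_linear_policy(states, p, b=1.0, r=1.0, lr=0.1, steps=500):
    """Fit the gain k of the policy u = k * x by gradient descent on the
    mean Hamiltonian over the sampled states (no distance to expert actions)."""
    k = 0.0
    mean_x2 = np.mean(states**2)
    for _ in range(steps):
        # d/dk of mean H(x, k*x) = mean(x^2) * (r*k + p*b)
        grad = mean_x2 * (r * k + p * b)
        k -= lr * grad
    return k

# Sampled states stand in for the states visited by the guiding controller.
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=256)
k_learned = fit_linear_policy(states, p=1.0)
```

In this toy case the minimizer of the Hamiltonian is u = -(b*p/r)*x, so the learned gain should approach -1, and the Hamiltonian evaluates to zero at the optimal control, consistent with the stationary HJB equation. The sketch omits the constraint terms and the neural network policy mentioned in the abstract.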

research
03/26/2021

Imitation Learning from MPC for Quadrupedal Multi-Gait Control

We present a learning algorithm for training a single policy that imitat...
research
04/03/2023

Imitation Learning from Nonlinear MPC via the Exact Q-Loss and its Gauss-Newton Approximation

This work presents a novel loss function for learning nonlinear Model Pr...
research
02/23/2021

Recurrent Model Predictive Control

This paper proposes an off-line algorithm, called Recurrent Model Predic...
research
09/22/2022

Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation

Despite decades of research, existing navigation systems still face real...
research
09/21/2021

Demonstration-Efficient Guided Policy Search via Imitation of Robust Tube MPC

We propose a demonstration-efficient strategy to compress a computationa...
research
11/09/2018

Sample-Efficient Policy Learning based on Completely Behavior Cloning

Direct policy search is one of the most important algorithm of reinforce...
research
06/19/2019

Safe and Near-Optimal Policy Learning for Model Predictive Control using Primal-Dual Neural Networks

In this paper, we propose a novel framework for approximating the explic...
