Efficient Deep Learning of Robust Policies from MPC using Imitation and Tube-Guided Data Augmentation

by Andrea Tagliabue, et al.

Imitation Learning (IL) has been increasingly employed to generate computationally efficient policies from task-relevant demonstrations provided by Model Predictive Control (MPC). However, commonly employed IL methods are often data- and computationally-inefficient: they require a large number of MPC demonstrations, which results in long training times, and they produce policies with limited robustness to disturbances not experienced during training. In this work, we propose an IL strategy to efficiently compress a computationally expensive MPC into a Deep Neural Network (DNN) policy that is robust to previously unseen disturbances. By using a robust variant of the MPC, called Robust Tube MPC (RTMPC), and leveraging properties of the controller, we introduce a computationally efficient Data Augmentation (DA) method that significantly reduces the number of MPC demonstrations and the training time required to generate a robust policy. Our approach opens the possibility of zero-shot transfer of a policy trained from a single MPC demonstration collected in a nominal domain, such as a simulation or a robot in a lab or other controlled environment, to a new domain with previously unseen bounded model errors and perturbations. Numerical and experimental evaluations performed using linear and nonlinear MPC for agile flight on a multirotor show that our method outperforms strategies commonly employed in IL, such as DAgger and Domain Randomization (DR), in terms of demonstration efficiency, training time, and robustness to perturbations unseen during training.
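The abstract does not spell out how the tube is used to generate extra training data, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the tube around the nominal MPC state is an axis-aligned box and that the ancillary controller is a linear feedback of the form u = u_nominal + K (x - x_nominal). The function name `augment_from_tube`, the gain `K`, and all numerical values are hypothetical and chosen for illustration.

```python
import numpy as np


def augment_from_tube(x_nominal, u_nominal, K, tube_half_width, n_samples, rng):
    """Sample extra (state, action) pairs around one nominal MPC step.

    Assumptions (not taken from the abstract): the tube is an axis-aligned
    box, and the ancillary controller is u = u_nominal + K @ (x - x_nominal).
    """
    # Draw state offsets uniformly inside the box centred on the nominal state.
    offsets = rng.uniform(-tube_half_width, tube_half_width,
                          size=(n_samples, x_nominal.size))
    states = x_nominal + offsets
    # Label each sampled state with the ancillary-controller action.
    actions = u_nominal + offsets @ K.T
    return states, actions


# Hypothetical two-state, one-input example.
rng = np.random.default_rng(0)
x_nominal = np.array([0.0, 1.0])
u_nominal = np.array([0.2])
K = np.array([[-1.0, -1.5]])            # illustrative feedback gain
tube_half_width = np.array([0.1, 0.2])  # illustrative tube size

states, actions = augment_from_tube(
    x_nominal, u_nominal, K, tube_half_width, n_samples=5, rng=rng
)
```

Under these assumptions, each sampled pair costs only a matrix product rather than an additional MPC solve, which is one way to see why such augmentation could reduce the number of demonstrations and the training time.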




Demonstration-Efficient Guided Policy Search via Imitation of Robust Tube MPC

We propose a demonstration-efficient strategy to compress a computationa...

Output Feedback Tube MPC-Guided Data Augmentation for Robust, Efficient Sensorimotor Policy Learning

Imitation learning (IL) can generate computationally efficient sensorimo...

MPC-guided Imitation Learning of Neural Network Policies for the Artificial Pancreas

Even though model predictive control (MPC) is currently the main algorit...

Efficient Deep Learning of Robust, Adaptive Policies using Tube MPC-Guided Data Augmentation

The deployment of agile autonomous systems in challenging, unstructured ...

Bayesian Gaussian mixture model for robotic policy imitation

A common approach to learn robotic skills is to imitate a policy demonst...

Computationally Efficient Data-Driven MPC for Agile Quadrotor Flight

This paper develops computationally efficient data-driven model predicti...

An Improved Data Augmentation Scheme for Model Predictive Control Policy Approximation

This paper considers the problem of data generation for MPC policy appro...
