Efficient Deep Learning of Robust, Adaptive Policies using Tube MPC-Guided Data Augmentation

by Tong Zhao et al.

The deployment of agile autonomous systems in challenging, unstructured environments requires adaptation capabilities and robustness to uncertainties. Existing robust and adaptive controllers, such as those based on model predictive control (MPC), can achieve impressive performance, but at the cost of heavy online onboard computation. Strategies that efficiently learn robust, onboard-deployable policies from MPC have emerged, but they still lack fundamental adaptation capabilities. In this work, we extend an existing efficient imitation learning (IL) algorithm for robust policy learning from MPC with the ability to learn policies that adapt to challenging model/environment uncertainties. The key idea of our approach is to modify the IL procedure by conditioning the policy on a learned lower-dimensional model/environment representation that can be efficiently estimated online. We tailor our approach to the task of learning an adaptive position and attitude control policy for trajectory tracking under challenging disturbances on a multirotor. Our evaluation, performed in a high-fidelity simulation environment, shows that a high-quality adaptive policy can be obtained in about 1.3 hours. We additionally demonstrate empirically rapid adaptation to in- and out-of-training-distribution uncertainties, achieving a 6.1 cm average position error under a wind disturbance that corresponds to about 50% of the weight of the robot and is 36% larger than the maximum wind seen during training.
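The core architectural idea in the abstract, a policy conditioned on a low-dimensional environment representation estimated online from recent observations, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the network sizes, the use of a state-action history window as the estimator's input, and all function names here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random small MLP weights (illustrative only, not trained)."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Tanh MLP forward pass with a linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

# Hypothetical dimensions: multirotor state, actuator commands,
# latent environment representation, and history window length.
STATE_DIM, ACT_DIM, LATENT_DIM, HIST = 12, 4, 8, 10

# Estimator: maps a window of recent (state, action) pairs to a
# low-dimensional latent z summarizing model/environment uncertainty
# (e.g., wind, payload). Cheap enough to run online at every step.
encoder = mlp([HIST * (STATE_DIM + ACT_DIM), 64, LATENT_DIM])

# Policy conditioned on the current state AND the latent z, so the
# same network can adapt its outputs as the estimate of z changes.
policy = mlp([STATE_DIM + LATENT_DIM, 64, 64, ACT_DIM])

def act(state, history):
    z = forward(encoder, history.reshape(-1))
    return forward(policy, np.concatenate([state, z]))

state = rng.standard_normal(STATE_DIM)
history = rng.standard_normal((HIST, STATE_DIM + ACT_DIM))
u = act(state, history)
print(u.shape)  # (4,)
```

In the paper's setting, both networks would be trained via the tube MPC-guided IL procedure; the sketch only shows the data flow that makes the policy adaptive, namely that the action depends on the online-estimated latent z rather than on the state alone.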




