Real-Time Optimal Guidance and Control for Interplanetary Transfers Using Deep Networks

by Dario Izzo, et al.

We consider the Earth-Venus mass-optimal interplanetary transfer of a low-thrust spacecraft and show how the optimal guidance can be represented by deep networks over a large portion of the state space and to a high degree of accuracy. Imitation (supervised) learning of optimal examples is used as the network training paradigm. The resulting models, called G&CNETs, are suitable for an on-board, real-time implementation of the spacecraft's optimal guidance and control system. A new general methodology called Backward Generation of Optimal Examples is introduced and shown to efficiently create all the optimal state-action pairs needed to train G&CNETs without solving optimal control problems. Compared with previous works, we are able to produce datasets containing a few orders of magnitude more optimal trajectories and to obtain network performances compatible with real mission requirements. Several schemes for training representations of either the optimal policy (thrust profile) or the value function (optimal mass) are proposed and tested. We find that both policy learning and value function learning accurately learn the optimal thrust, and that a spacecraft employing the learned thrust is able to reach the target conditions spending only 2 permil (0.2%) more propellant than in the corresponding mathematically optimal transfer. Moreover, in the case of value function learning, the optimal propellant mass can be predicted with an error well within 1%. All G&CNETs produced are tested in simulations of interplanetary transfers for their ability to reach the target conditions optimally, starting from both nominal and off-nominal conditions.
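The idea of generating optimal state-action pairs backward from the target, rather than by solving one optimal control problem per example, can be illustrated on a toy problem. The sketch below is not the paper's Earth-Venus setup or its exact procedure. It assumes a time-optimal double integrator (position `x`, velocity `v`, control `u` in {-1, +1}, target at the origin), for which Pontryagin's principle gives bang-bang extremals with at most one switch. The function names `backward_extremal`, `build_dataset` and `known_feedback` are hypothetical. Each extremal is propagated backward in time-to-go from the target, and every sampled point yields one optimal state-action pair.

```python
import numpy as np

def backward_extremal(tau_switch, u_final, tau_max, n=40):
    """Propagate one bang-bang extremal backward from the origin.

    tau is time-to-go. The last arc (tau <= tau_switch) uses u_final,
    the earlier arc uses -u_final. Backward dynamics: dx/dtau = -v,
    dv/dtau = -u, integrated in closed form on each arc.
    """
    tau = np.linspace(0.0, tau_max, n + 1)[1:]   # skip the target itself
    d1 = np.minimum(tau, tau_switch)             # time spent on the last arc
    d2 = np.maximum(tau - tau_switch, 0.0)       # time spent on the earlier arc
    v_s = -u_final * d1                          # state reached on the last arc
    x_s = 0.5 * u_final * d1**2
    u_prev = -u_final                            # control before the switch
    v = v_s - u_prev * d2
    x = x_s - v_s * d2 + 0.5 * u_prev * d2**2
    u = np.where(tau <= tau_switch, u_final, u_prev).astype(float)
    return np.column_stack([x, v]), u

def build_dataset(n_traj=200, tau_max=3.0, seed=0):
    """Sample terminal conditions at random and stack the resulting pairs."""
    rng = np.random.default_rng(seed)
    all_states, all_actions = [], []
    for _ in range(n_traj):
        tau_switch = rng.uniform(0.1, tau_max)   # plays the role of a perturbed final costate
        u_final = rng.choice([-1.0, 1.0])
        s, a = backward_extremal(tau_switch, u_final, tau_max)
        all_states.append(s)
        all_actions.append(a)
    return np.vstack(all_states), np.concatenate(all_actions)

def known_feedback(states):
    """Textbook time-optimal feedback law for the double integrator."""
    x, v = states[:, 0], states[:, 1]
    return -np.sign(x + 0.5 * v * np.abs(v))

states, actions = build_dataset()
```

In an imitation-learning setting, `states` and `actions` would serve as inputs and targets for a supervised learner. For this toy problem the optimal feedback law is known analytically, so the generated labels can be checked against `known_feedback` at every point away from the switching curve.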



Interplanetary Transfers via Deep Representations of the Optimal Policy and/or of the Value Function

A number of applications to interplanetary trajectories have been recent...

Neural representation of a time optimal, constant acceleration rendezvous

We train neural models to represent both the optimal policy (i.e. the op...

HJB Optimal Feedback Control with Deep Differential Value Functions and Action Constraints

Learning optimal feedback control laws capable of executing optimal traj...

Infinite-Horizon Reach-Avoid Zero-Sum Games via Deep Reinforcement Learning

In this paper, we consider the infinite-horizon reach-avoid zero-sum gam...

Approximation of the value function for optimal control problems on stratified domains

In optimal control problems defined on stratified domains, the dynamics ...

Estimating Q(s,s') with Deep Deterministic Dynamics Gradients

In this paper, we introduce a novel form of value function, Q(s, s'), th...

Neural-Rendezvous: Learning-based Robust Guidance and Control to Encounter Interstellar Objects

Interstellar objects (ISOs), astronomical objects not gravitationally bo...
