Trajectory Optimization for Unknown Constrained Systems using Reinforcement Learning

by   Kei Ota, et al.
Mitsubishi Electric Corporation

In this paper, we propose a reinforcement learning-based algorithm for trajectory optimization for constrained dynamical systems. This problem is motivated by the fact that for most robotic systems, the dynamics may not always be known. Generating smooth, dynamically feasible trajectories could be difficult for such systems. Using sampling-based algorithms for motion planning may result in trajectories that are prone to undesirable control jumps. However, they can usually provide a good reference trajectory which a model-free reinforcement learning algorithm can then exploit by limiting the search domain and quickly finding a dynamically smooth trajectory. We use this idea to train a reinforcement learning agent to learn a dynamically smooth trajectory in a curriculum learning setting. Furthermore, for generalization, we parameterize the policies with goal locations, so that the agent can be trained for multiple goals simultaneously. We show result in both simulated environments as well as real experiments, for a 6-DoF manipulator arm operated in position-controlled mode to validate the proposed idea. We compare the proposed ideas against a PID controller which is used to track a designed trajectory in configuration space. Our experiments show that our RL agent trained with a reference path outperformed a model-free PID controller of the type commonly used on many robotic platforms for trajectory tracking.


page 1

page 5


PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals

Learning with sparse rewards remains a significant challenge in reinforc...

Reinforcement Learning Trajectory Generation and Control for Aggressive Perching on Vertical Walls with Quadrotors

Micro aerial vehicles are widely being researched and employed due to th...

Learning Representative Trajectories of Dynamical Systems via Domain-Adaptive Imitation

Domain-adaptive trajectory imitation is a skill that some predators lear...

Secure Planning Against Stealthy Attacks via Model-Free Reinforcement Learning

We consider the problem of security-aware planning in an unknown stochas...

Learning Stabilizable Dynamical Systems via Control Contraction Metrics

We propose a novel framework for learning stabilizable nonlinear dynamic...

Model-free tracking control of complex dynamical trajectories with machine learning

Nonlinear tracking control enabling a dynamical system to track a desire...

Please sign up or login with your details

Forgot password? Click here to reset