Comparison of Reinforcement Learning algorithms applied to the Cart Pole problem

by Savinay Nagendra et al.

Designing optimal controllers remains challenging as systems grow more complex and inherently nonlinear. The principal advantage of reinforcement learning (RL) is its ability to learn an optimal control strategy from interaction with the environment alone. In this paper, RL is explored for control of the benchmark cart-pole dynamical system with no prior knowledge of its dynamics. RL algorithms such as temporal-difference learning, policy-gradient actor-critic, and value function approximation are compared in this setting against the standard LQR solution. Further, we propose a novel approach for integrating RL with swing-up controllers.
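To make the setting concrete, the sketch below shows tabular Q-learning (a temporal-difference method) balancing a self-contained cart-pole simulation. The dynamics use the standard cart-pole equations of motion; the discretization, force magnitude, and hyperparameters are illustrative assumptions, not the paper's exact experimental setup.

```python
import math
import random

# Standard cart-pole parameters (assumed values, matching the common benchmark).
GRAVITY, MASS_CART, MASS_POLE, POLE_LEN, DT = 9.8, 1.0, 0.1, 0.5, 0.02
TOTAL_MASS = MASS_CART + MASS_POLE

def step(state, force):
    """Advance the cart-pole one Euler step under the applied force."""
    x, x_dot, theta, theta_dot = state
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    temp = (force + MASS_POLE * POLE_LEN * theta_dot ** 2 * sin_t) / TOTAL_MASS
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_LEN * (4.0 / 3.0 - MASS_POLE * cos_t ** 2 / TOTAL_MASS))
    x_acc = temp - MASS_POLE * POLE_LEN * theta_acc * cos_t / TOTAL_MASS
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)

def discretize(state):
    """Map the continuous state to a coarse grid (assumption: 6 bins per dim)."""
    bounds = [(-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.0, 3.0)]
    idx = []
    for s, (lo, hi) in zip(state, bounds):
        s = min(max(s, lo), hi)
        idx.append(int((s - lo) / (hi - lo) * 5.999))
    return tuple(idx)

def train(episodes=200, alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration; returns episode lengths."""
    rng = random.Random(seed)
    Q = {}          # state-bin tuple -> [Q(left), Q(right)]
    lengths = []
    for _ in range(episodes):
        state = tuple(rng.uniform(-0.05, 0.05) for _ in range(4))
        for t in range(200):
            s = discretize(state)
            qs = Q.setdefault(s, [0.0, 0.0])
            a = rng.randrange(2) if rng.random() < eps else qs.index(max(qs))
            nxt = step(state, 10.0 if a == 1 else -10.0)
            done = abs(nxt[0]) > 2.4 or abs(nxt[2]) > 0.21
            reward = 0.0 if done else 1.0
            nxt_qs = Q.setdefault(discretize(nxt), [0.0, 0.0])
            target = reward + (0.0 if done else gamma * max(nxt_qs))
            qs[a] += alpha * (target - qs[a])   # TD update toward the target
            state = nxt
            if done:
                break
        lengths.append(t + 1)
    return lengths
```

An LQR controller for the same system would instead linearize the dynamics around the upright equilibrium and solve a Riccati equation offline; the RL agent above reaches a balancing policy purely from interaction, which is the comparison the paper explores.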


