Reinforcement Learning with Tensor Networks: Application to Dynamical Large Deviations

09/28/2022
by Edward Gillman, et al.

We present a framework to integrate tensor network (TN) methods with reinforcement learning (RL) for solving dynamical optimisation tasks. We consider the RL actor-critic method, a model-free approach for solving RL problems, and introduce TNs as the approximators for its policy and value functions. Our "actor-critic with tensor networks" (ACTeN) method is especially well suited to problems with large and factorisable state and action spaces. As an illustration of the applicability of ACTeN, we solve the exponentially hard task of sampling rare trajectories in two paradigmatic stochastic models: the East model of glasses and the asymmetric simple exclusion process (ASEP), the latter being particularly challenging for other methods due to the absence of detailed balance. With substantial potential for further integration with the vast array of existing RL methods, the approach introduced here is promising both for applications in physics and for multi-agent RL problems more generally.
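
To make the idea of a tensor network as a function approximator concrete, the following is a minimal sketch, not the authors' implementation. It assumes a matrix product state (MPS), one common tensor network, with open boundaries and binary site variables, and evaluates a scalar "value" for a lattice configuration by multiplying one matrix per site. The names `random_mps` and `mps_value`, the bond dimension, and the random initialisation are all hypothetical choices made for illustration.

```python
import numpy as np

def random_mps(n_sites, bond_dim, phys_dim=2, seed=0):
    """Build a random open-boundary MPS (hypothetical helper).

    Each tensor has shape (left_bond, phys_dim, right_bond); the outermost
    bonds have dimension 1 so the full contraction yields a scalar.
    """
    rng = np.random.default_rng(seed)
    dims = [1] + [bond_dim] * (n_sites - 1) + [1]
    return [
        rng.normal(scale=1.0 / np.sqrt(bond_dim),
                   size=(dims[i], phys_dim, dims[i + 1]))
        for i in range(n_sites)
    ]

def mps_value(mps, state):
    """Evaluate the MPS on one configuration, e.g. state = (0, 1, 1, 0).

    For each site, the local state selects a matrix from the site tensor;
    the value is the product of those matrices taken left to right.
    """
    vec = np.ones(1)                    # trivial left boundary vector
    for tensor, s in zip(mps, state):
        vec = vec @ tensor[:, s, :]     # absorb one site at a time
    return float(vec[0])                # right boundary has dimension 1

# Usage: a value estimate for one configuration of a 6-site binary lattice.
mps = random_mps(n_sites=6, bond_dim=3)
value = mps_value(mps, (0, 1, 1, 0, 0, 1))
```

In this sketch the number of parameters grows linearly with the number of sites at fixed bond dimension, whereas a lookup table over all configurations would grow exponentially, which is the sense in which such an ansatz suits large, factorisable state spaces.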

research
04/22/2022

TASAC: a twin-actor reinforcement learning framework with stochastic policy for batch process control

Due to their complex nonlinear dynamics and batch-to-batch variability, ...
research
10/31/2021

An Actor-Critic Method for Simulation-Based Optimization

We focus on a simulation-based optimization problem of choosing the best...
research
01/29/2019

Emergence of Hierarchy via Reinforcement Learning Using a Multiple Timescale Stochastic RNN

Although recurrent neural networks (RNNs) for reinforcement learning (RL...
research
07/02/2021

Reinforcement Learning Provides a Flexible Approach for Realistic Supply Chain Safety Stock Optimisation

Although safety stock optimisation has been studied for more than 60 yea...
research
05/30/2022

Stock Trading Optimization through Model-based Reinforcement Learning with Resistance Support Relative Strength

Reinforcement learning (RL) is gaining attention by more and more resear...
research
07/20/2021

Proximal Policy Optimization for Tracking Control Exploiting Future Reference Information

In recent years, reinforcement learning (RL) has gained increasing atten...
research
12/02/2020

Pareto Deterministic Policy Gradients and Its Application in 5G Massive MIMO Networks

In this paper, we consider jointly optimizing cell load balance and netw...
