Mean field for Markov Decision Processes: from Discrete to Continuous Optimization

04/14/2010
by Nicolas Gast, et al.

We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODEs). We show that the optimal reward of such a Markov Decision Process, which satisfies a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Process. We give bounds on the difference of the rewards, and a constructive algorithm for deriving an approximating solution to the Markov Decision Process from a solution of the HJB equation. Three examples, pertaining respectively to investment strategies, population dynamics control, and scheduling in queues, are developed. They are used to illustrate and justify the construction of the controlled ODE and to show the gain obtained by solving a continuous HJB equation rather than a large discrete Bellman equation.

research
11/15/2017

Quantile Markov Decision Process

In this paper, we consider the problem of optimizing the quantiles of th...
research
07/02/2019

Learning the Arrow of Time

We humans seem to have an innate understanding of the asymmetric progres...
research
11/29/2022

Airfoil Shape Optimization using Deep Q-Network

The feasibility of using reinforcement learning for airfoil shape optimi...
research
06/26/2020

Approximating Euclidean by Imprecise Markov Decision Processes

Euclidean Markov decision processes are a powerful tool for modeling con...
research
11/22/2017

Budget Allocation in Binary Opinion Dynamics

In this article we study the allocation of a budget to promote an opinio...
research
02/08/2015

Contextual Markov Decision Processes

We consider a planning problem where the dynamics and rewards of the env...
research
04/25/2017

Sufficient Markov Decision Processes with Alternating Deep Neural Networks

Advances in mobile computing technologies have made it possible to monit...
