Quantile Markov Decision Process

11/15/2017
by   Xiaocheng Li, et al.

In this paper, we consider the problem of optimizing the quantiles of the cumulative rewards of Markov Decision Processes (MDP), which we refer to as Quantile Markov Decision Processes (QMDP). Traditionally, the goal of an MDP is to maximize the expected cumulative reward over a defined horizon (possibly infinite). In many applications, however, a decision maker may be interested in optimizing a specific quantile of the cumulative reward instead of its expectation. We provide analytical results characterizing the optimal QMDP solution and present algorithms for solving the QMDP. We illustrate the model with two experiments: a grid game and an HIV optimal treatment experiment.
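To make the contrast between the two objectives concrete, the following is a minimal sketch and not the paper's algorithm. It assumes the lower-quantile convention (the smallest x with P(X ≤ x) ≥ τ), which may differ from the definition used in the full text, and the two cumulative-reward distributions `safe` and `risky` are hypothetical values invented for illustration.

```python
from typing import Sequence


def quantile(values: Sequence[float], probs: Sequence[float], tau: float) -> float:
    """Lower tau-quantile of a discrete distribution (assumed convention):
    the smallest x such that P(X <= x) >= tau."""
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau must be in (0, 1]")
    cumulative = 0.0
    # Walk the outcomes in increasing order, accumulating probability mass.
    for x, p in sorted(zip(values, probs)):
        cumulative += p
        if cumulative >= tau - 1e-12:  # small tolerance for float round-off
            return x
    return max(values)


def mean(values: Sequence[float], probs: Sequence[float]) -> float:
    """Expected value of a discrete distribution."""
    return sum(x * p for x, p in zip(values, probs))


# Hypothetical cumulative-reward distributions induced by two policies.
safe = ([4.0, 5.0, 6.0], [0.25, 0.5, 0.25])
risky = ([0.0, 10.0], [0.4, 0.6])

for name, (vals, ps) in {"safe": safe, "risky": risky}.items():
    print(name, "mean:", mean(vals, ps), "q(0.25):", quantile(vals, ps, 0.25))
```

In this toy example the expectation favors the risky policy, while the 0.25-quantile favors the safe one, which is the kind of disagreement between criteria that motivates a quantile objective.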

