Policy Gradients with Variance Related Risk Criteria

06/27/2012
by Dotan Di Castro, et al.
Managing risk in dynamic decision problems is of cardinal importance in many fields, such as finance and process control. The most common approach to defining risk is through variance-related criteria such as the Sharpe ratio or the standard-deviation-adjusted reward. It is known that optimizing many of these variance-related risk criteria is NP-hard. In this paper we devise a framework for local policy-gradient-style reinforcement learning algorithms for variance-related criteria. Our starting point is a new formula for the variance of the cost-to-go in episodic tasks. Using this formula, we develop policy gradient algorithms for criteria that involve both the expected cost and the variance of the cost. We prove the convergence of these algorithms to local minima and demonstrate their applicability in a portfolio planning problem.
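To make the idea of a gradient for a mean-plus-variance criterion concrete, here is a minimal sketch. It is not the paper's algorithm and does not use its cost-to-go variance formula, which the abstract does not state. It assumes a one-step episodic task with a softmax policy, and it uses only the standard likelihood-ratio identity together with V = E[B^2] - J^2, so that grad V = grad E[B^2] - 2 J grad J. All names (`risk_gradient`, `descend`, `lam`) and the cost numbers are hypothetical, and expectations are computed exactly over actions where a real method would estimate them from sampled trajectories.

```python
import math


def softmax(theta):
    """Numerically stable softmax over a list of action preferences."""
    top = max(theta)
    exps = [math.exp(t - top) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def mean_and_variance(theta, cost_mean, cost_second_moment):
    """Exact mean J and variance V of the one-step cost under the policy."""
    pi = softmax(theta)
    j = sum(p * m for p, m in zip(pi, cost_mean))
    second = sum(p * s for p, s in zip(pi, cost_second_moment))
    return j, second - j * j


def risk_gradient(theta, cost_mean, cost_second_moment, lam):
    """Gradient of J + lam * V via the likelihood-ratio identity."""
    pi = softmax(theta)
    n = len(theta)
    j, _ = mean_and_variance(theta, cost_mean, cost_second_moment)
    grad_j = [0.0] * n
    grad_second = [0.0] * n
    for a in range(n):
        for k in range(n):
            # d log pi_a / d theta_k for a softmax policy.
            score = (1.0 if a == k else 0.0) - pi[k]
            grad_j[k] += pi[a] * cost_mean[a] * score
            grad_second[k] += pi[a] * cost_second_moment[a] * score
    # V = E[B^2] - J^2, so grad V = grad E[B^2] - 2 * J * grad J.
    return [gj + lam * (gs - 2.0 * j * gj)
            for gj, gs in zip(grad_j, grad_second)]


def descend(cost_mean, cost_second_moment, lam, step=0.5, iters=2000):
    """Plain gradient descent on J + lam * V; returns the final policy."""
    theta = [0.0] * len(cost_mean)
    for _ in range(iters):
        grad = risk_gradient(theta, cost_mean, cost_second_moment, lam)
        theta = [t - step * g for t, g in zip(theta, grad)]
    return softmax(theta)


# Hypothetical costs: action 0 is safe (mean 1.0, variance 0.0);
# action 1 is cheaper on average but risky (mean 0.8, variance 4.0).
COST_MEAN = [1.0, 0.8]
COST_SECOND_MOMENT = [1.0, 0.8 ** 2 + 4.0]
```

Under these assumed numbers, `descend(COST_MEAN, COST_SECOND_MOMENT, lam=0.0)` moves the policy toward the cheaper, risky action, while `lam=1.0` moves it toward the safe one, which is the qualitative trade-off a variance-penalized criterion is meant to capture.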

research
01/01/2013

Policy Evaluation with Variance Related Risk Criteria in Markov Decision Processes

In this paper we extend temporal difference policy evaluation algorithms...
research
03/25/2014

Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs

In many sequential decision-making problems we may want to manage risk b...
research
10/03/2020

Policy Gradient with Expected Quadratic Utility Maximization: A New Mean-Variance Approach in Reinforcement Learning

In real-world decision-making problems, risk management is critical. Amo...
research
07/17/2023

An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient

Restricting the variance of a policy's return is a popular choice in ris...
research
06/15/2022

Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning

Keeping risk under control is often more crucial than maximizing expecte...
research
06/12/2014

Algorithms for CVaR Optimization in MDPs

In many sequential decision-making problems we may want to manage risk b...
research
04/15/2014

Optimizing the CVaR via Sampling

Conditional Value at Risk (CVaR) is a prominent risk measure that is bei...
