Parametric Return Density Estimation for Reinforcement Learning

03/15/2012
by Tetsuro Morimura, et al.

Most conventional reinforcement learning (RL) algorithms aim to optimize decision-making rules in terms of the expected return. However, especially for risk management, risk-sensitive criteria such as the value-at-risk or the expected shortfall are sometimes preferred in real applications. Here, we describe a parametric method for estimating the density of the returns, which allows us to handle various criteria in a unified manner. We first extend the Bellman equation for the conditional expected return to cover the conditional probability density of the returns. Then we derive an extension of the TD-learning algorithm for estimating the return densities in an unknown environment. As test instances, several parametric density estimation algorithms are presented for the Gaussian, Laplace, and skewed Laplace distributions. Through numerical experiments, we show that these algorithms lead to risk-sensitive as well as robust RL paradigms.
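
To illustrate why an estimated return density lets several criteria be handled in a unified manner, the following is a minimal sketch, not the authors' algorithm. It assumes the return density has already been estimated as a Gaussian with mean `mu` and standard deviation `sigma`, and it reads the expected return, the value-at-risk, and the expected shortfall off that one density. The function name `gaussian_risk_measures`, the level `alpha`, and the lower-tail convention (risk measured on the worst `alpha` fraction of returns) are assumptions made for this example.

```python
from statistics import NormalDist


def gaussian_risk_measures(mu, sigma, alpha=0.05):
    """Risk criteria of a Gaussian return density N(mu, sigma^2).

    Hypothetical helper for illustration. Uses the lower-tail convention:
    value-at-risk is the alpha-quantile of the return, and expected
    shortfall is the mean return conditional on falling at or below it.
    """
    std_normal = NormalDist()            # standard normal N(0, 1)
    z = std_normal.inv_cdf(alpha)        # alpha-quantile of N(0, 1)
    expected_return = mu
    value_at_risk = mu + sigma * z
    # Closed form for a Gaussian: E[X | X <= q_alpha] = mu - sigma * pdf(z) / alpha
    expected_shortfall = mu - sigma * std_normal.pdf(z) / alpha
    return expected_return, value_at_risk, expected_shortfall


if __name__ == "__main__":
    mean, var, es = gaussian_risk_measures(mu=1.0, sigma=2.0, alpha=0.05)
    print(f"expected return:    {mean:.4f}")
    print(f"value-at-risk:      {var:.4f}")
    print(f"expected shortfall: {es:.4f}")
```

All three criteria come from the same two parameters, so a learner that maintains a parametric return density could, under these assumptions, switch between risk-neutral and risk-sensitive objectives without re-estimating anything. For the Laplace and skewed Laplace cases named in the abstract, the quantile and tail-mean formulas would differ.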


