Algorithms for offline bandits must optimize decisions in uncertain envi...
Optimizing static risk-averse objectives in Markov decision processes is...
We introduce the Blackwell discount factor for Markov Decision Processes...
Robust Markov decision processes (RMDPs) are promising models that provi...
Robust Markov decision processes (MDPs) are used for applications of dyn...
Prior work on safe Reinforcement Learning (RL) has studied risk-aversion...
In recent years, robust Markov decision processes (MDPs) have emerged as...
The difficulty in specifying rewards for many real-world problems has le...
Imitation learning (IL) algorithms use expert demonstrations to learn a ...
In reinforcement learning, robust policies for high-stakes decision-maki...
One of the main challenges in imitation learning is determining what act...
Having a perfect model to compute the optimal policy is often infeasible...
Robust Markov decision processes (MDPs) allow computing reliable soluti...
In this paper, we introduce proximal gradient temporal difference learni...
Optimal policies in Markov decision processes (MDPs) are very sensitive ...
Robust MDPs are a promising framework for computing robust policies in r...
Optimism about the poorly understood states and actions is the main driv...
Robust MDPs (RMDPs) can be used to compute policies with provable worst-...
Robustness is important for sequential decision making in a stochastic d...
We propose to use boosted regression trees as a way to compute human-int...
Many efficient algorithms with strong theoretical guarantees have been p...
Multi-armed bandits are a quintessential machine learning problem requir...
An important problem in sequential decision-making under uncertainty is ...
We propose a method for building an interpretable recommender system for...
Randomized matrix compression techniques, such as the Johnson-Lindenstra...
Multiagent planning and coordination problems are common and known to be...
We propose solution methods for previously-unsolved constrained MDPs in ...
Stochastic domains often involve risk-averse decision makers. While rece...
Approximate dynamic programming is a popular method for solving large Ma...
Approximate dynamic programming has been used successfully in a large va...