On Uncertainty in Deep State Space Models for Model-Based Reinforcement Learning

by Philipp Becker et al.

Improved state space models, such as Recurrent State Space Models (RSSMs), are a key factor behind recent advances in model-based reinforcement learning (RL). Yet, despite their empirical success, many of the underlying design choices are not well understood. We show that RSSMs use a suboptimal inference scheme and that models trained using this inference overestimate the aleatoric uncertainty of the ground truth system. We find this overestimation implicitly regularizes RSSMs and allows them to succeed in model-based RL. We postulate that this implicit regularization fulfills the same functionality as explicitly modeling epistemic uncertainty, which is crucial for many other model-based RL approaches. Yet, overestimating aleatoric uncertainty can also impair performance in cases where accurately estimating it matters, e.g., when we have to deal with occlusions, missing observations, or fusing sensor modalities at different frequencies. Moreover, the implicit regularization is a side-effect of the inference scheme and not the result of a rigorous, principled formulation, which renders analyzing or improving RSSMs difficult. Thus, we propose an alternative approach building on well-understood components for modeling aleatoric and epistemic uncertainty, dubbed Variational Recurrent Kalman Network (VRKN). This approach uses Kalman updates for exact smoothing inference in a latent space and Monte Carlo Dropout to model epistemic uncertainty. Due to the Kalman updates, the VRKN can naturally handle missing observations or sensor fusion problems with varying numbers of observations per time step. Our experiments show that using the VRKN instead of the RSSM improves performance in tasks where appropriately capturing aleatoric uncertainty is crucial while matching it in the deterministic standard benchmarks.
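The abstract's claim that Kalman-style updates "naturally handle missing observations" can be illustrated with a minimal linear-Gaussian Kalman step. This is an illustrative sketch only, not the paper's VRKN architecture (the function name `kalman_step` and the matrix names are my own): when an observation is missing, the correction step is simply skipped and the predicted belief is carried forward, so uncertainty grows instead of being corrected.

```python
import numpy as np

def kalman_step(mu, P, A, Q, H, R, y=None):
    """One predict(+update) step of a linear-Gaussian Kalman filter.

    mu, P : current belief (mean, covariance)
    A, Q  : linear dynamics and process noise covariance
    H, R  : observation model and observation noise covariance
    y     : observation, or None to mark a missing observation
    """
    # Predict: propagate mean and covariance through the dynamics.
    mu_pred = A @ mu
    P_pred = A @ P @ A.T + Q
    if y is None:
        # Missing observation: the prior simply becomes the posterior,
        # and the covariance (uncertainty) grows by the process noise.
        return mu_pred, P_pred
    # Update: standard Kalman correction with gain K.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    mu_post = mu_pred + K @ (y - H @ mu_pred)
    P_post = (np.eye(len(mu)) - K @ H) @ P_pred
    return mu_post, P_post
```

With a varying number of observations per step (the sensor-fusion setting the abstract mentions), one would apply the correction once per available observation; with none, only the prediction runs.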



Related research:

- Disentangling Epistemic and Aleatoric Uncertainty in Reinforcement Learning
- Risk Sensitive Model-Based Reinforcement Learning using Uncertainty Guided Planning
- Uncertainty-Based Out-of-Distribution Detection in Deep Reinforcement Learning
- Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces
- INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL
- Bayesian Exploration Networks
- Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability
