Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP

06/21/2023
by Jiacheng Guo, et al.

In this paper, we study representation learning in partially observable Markov Decision Processes (POMDPs), where the agent learns a decoder function that maps a sequence of high-dimensional raw observations to a compact representation and uses it for more efficient exploration and planning. We focus on the sub-classes of γ-observable and decodable POMDPs, for which statistically tractable learning has been shown to be possible, but for which no computationally efficient algorithm has been known. We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU) to perform representation learning and achieve efficient sample complexity, while only calling supervised learning computational oracles. We then show how to adapt this algorithm to the broader class of γ-observable POMDPs.

research
06/22/2021

Provably Efficient Representation Learning in Low-rank Markov Decision Processes

The success of deep reinforcement learning (DRL) is due to the power of ...
research
09/29/2022

Optimistic MLE – A Generic Model-based Algorithm for Partially Observable Sequential Decision Making

This paper introduces a simple, efficient learning algorithm for general...
research
07/08/2023

Efficient Model-Free Exploration in Low-Rank MDPs

A major challenge in reinforcement learning is to develop practical, sam...
research
04/12/2023

Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL

We study the design of sample-efficient algorithms for reinforcement lea...
research
02/14/2022

Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization

In the sequential decision making setting, an agent aims to achieve syst...
research
05/26/2022

Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency

Reinforcement learning in partially observed Markov decision processes (...
research
06/07/2022

Learning in Observable POMDPs, without Computationally Intractable Oracles

Much of reinforcement learning theory is built on top of oracles that ar...
