Making Linear MDPs Practical via Contrastive Representation Learning

07/14/2022
by Tianjun Zhang, et al.

It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations. This motivates much of the recent theoretical study on linear MDPs. However, most approaches require a given representation under unrealistic assumptions about the normalization of the decomposition or introduce unresolved computational challenges in practice. Instead, we consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning via contrastive estimation. The framework also admits confidence-adjusted index algorithms, enabling an efficient and principled approach to incorporating optimism or pessimism in the face of uncertainty. To the best of our knowledge, this provides the first practical representation learning method for linear MDPs that achieves both strong theoretical guarantees and empirical performance. Theoretically, we prove that the proposed algorithm is sample efficient in both the online and offline settings. Empirically, we demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
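
For readers unfamiliar with the term, the following sketch writes out the standard linear MDP assumption as it usually appears in the literature; the abstract itself does not state it. The symbols \phi, \mu, \theta, d and \gamma are notation assumed here for illustration and are not taken from the paper. The block shows why a low-rank transition model makes the action-value function linear in the features, and where the normalization requirement mentioned above comes from.

```latex
% Standard linear MDP assumption (assumed notation, not from the abstract):
% feature maps \phi : S \times A \to \mathbb{R}^d, \mu : S \to \mathbb{R}^d,
% and a reward vector \theta \in \mathbb{R}^d.
\begin{align*}
  P(s' \mid s, a) &= \langle \phi(s,a), \mu(s') \rangle, \\
  r(s, a)         &= \langle \phi(s,a), \theta \rangle .
\end{align*}

% Consequence: for any policy \pi with value function V^\pi and discount \gamma,
% the action-value function is linear in \phi(s,a).
\begin{align*}
  Q^\pi(s,a)
    &= r(s,a) + \gamma \int P(s' \mid s,a)\, V^\pi(s')\, ds' \\
    &= \Big\langle \phi(s,a),\;
         \theta + \gamma \int \mu(s')\, V^\pi(s')\, ds' \Big\rangle .
\end{align*}

% Normalization: P(\cdot \mid s,a) must be a probability distribution, so
% any learned pair (\phi, \mu) has to satisfy, for every (s,a),
\begin{align*}
  \int \langle \phi(s,a), \mu(s') \rangle \, ds' = 1,
  \qquad
  \langle \phi(s,a), \mu(s') \rangle \ge 0 .
\end{align*}
```

Enforcing the last pair of constraints on learned features is one way to read the "normalization of the decomposition" issue that the abstract raises; the paper's alternative definition is not reproduced here.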

research
07/29/2022

Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning

In view of its power in extracting feature representation, contrastive s...
research
02/14/2021

Model-free Representation Learning and Exploration in Low-rank MDPs

The low rank MDP has emerged as an important model for studying represen...
research
07/08/2023

Efficient Model-Free Exploration in Low-Rank MDPs

A major challenge in reinforcement learning is to develop practical, sam...
research
08/10/2023

Provably Efficient Algorithm for Nonstationary Low-Rank MDPs

Reinforcement learning (RL) under changing environment models many real-...
research
06/08/2021

Learning Markov State Abstractions for Deep Reinforcement Learning

The fundamental assumption of reinforcement learning in Markov decision ...
research
11/22/2021

A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning

Representation learning lies at the heart of the empirical success of de...
research
07/01/2023

Provably Efficient UCB-type Algorithms For Learning Predictive State Representations

The general sequential decision-making problem, which includes Markov de...
