Bilinear Classes: A Structural Framework for Provable Generalization in RL

03/19/2021
by Simon S. Du, et al.

This work introduces Bilinear Classes, a new structural framework, which permits generalization in reinforcement learning in a wide variety of settings through the use of function approximation. The framework incorporates nearly all existing models in which a polynomial sample complexity is achievable, and, notably, also includes new models, such as the Linear Q^*/V^* model in which both the optimal Q-function and the optimal V-function are linear in some known feature space. Our main result provides an RL algorithm which has polynomial sample complexity for Bilinear Classes; notably, this sample complexity is stated in terms of a reduction to the generalization error of an underlying supervised learning sub-problem. These bounds nearly match the best known sample complexity bounds for existing models. Furthermore, this framework also extends to the infinite-dimensional (RKHS) setting: for the Linear Q^*/V^* model, linear MDPs, and linear mixture MDPs, we provide sample complexities that have no explicit dependence on the feature dimension (which could be infinite), but instead depend only on information-theoretic quantities.
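
A brief sketch of the Linear Q^*/V^* assumption referenced above (paraphrased; the feature-map names φ and ψ are our notation, not taken from the abstract): there are known feature maps φ(s, a) and ψ(s), together with unknown parameter vectors w^* and θ^*, such that for every state s and action a,

    Q^*(s, a) = ⟨w^*, φ(s, a)⟩   and   V^*(s) = ⟨θ^*, ψ(s)⟩.

Only the optimal value functions are assumed to be linear; the transition dynamics need not have any particular structure, which is what separates this model from linear MDPs and linear mixture MDPs.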

Related research

A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning (09/30/2022)
With the increasing need for handling large state and action spaces, gen...

Target Network and Truncation Overcome The Deadly Triad in Q-Learning (03/05/2022)
Q-learning with function approximation is one of the most empirically su...

Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms (02/01/2021)
Finding the minimal structural assumptions that empower sample-efficient...

Near-Linear Sample Complexity for L_p Polynomial Regression (11/13/2022)
We study L_p polynomial regression. Given query access to a function f:[...

Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting (05/17/2021)
Low-complexity models such as linear function representation play a pivo...

Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs (10/16/2021)
Q-learning is a popular Reinforcement Learning (RL) algorithm which is w...

Scalable Bilinear π Learning Using State and Action Features (04/27/2018)
Approximate linear programming (ALP) represents one of the major algorit...
