Exponential Hardness of Reinforcement Learning with Linear Function Approximation

02/25/2023
by Daniel Kane, et al.
A fundamental question in reinforcement learning theory is: if the optimal value functions are linear in given features, can we learn them efficiently? The counterpart of this problem in supervised learning, linear regression, can be solved efficiently both statistically and computationally. It was therefore quite surprising when a recent work <cit.> showed a computational-statistical gap for linear reinforcement learning: even though algorithms with polynomial sample complexity exist, unless NP = RP there are no polynomial-time algorithms for this setting. In this work, we build on their result to show a computational lower bound for linear reinforcement learning, exponential in the feature dimension and the horizon, under the Randomized Exponential Time Hypothesis. To prove this, we build a round-based game in which, in each round, the learner searches for an unknown vector in a unit hypercube. The rewards in this game are chosen so that if the learner achieves large reward, then the learner's actions can be used to simulate solving a variant of 3-SAT in which (a) each variable appears in a bounded number of clauses, and (b) if an instance has no satisfying assignment, then no assignment satisfies more than a (1-ϵ)-fraction of its clauses. We use standard reductions to show that this 3-SAT variant is approximately as hard as 3-SAT. Finally, we also show a lower bound optimized for horizon dependence that almost matches the best known upper bound of exp(√(H)).
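To make the two properties of the 3-SAT variant concrete, here is a minimal Python sketch. It is not the paper's reduction or its game. It only illustrates, on a toy formula, what "bounded occurrence" and "no assignment satisfies more than a (1-ϵ)-fraction of clauses" mean. The clause encoding and the function names (`max_occurrences`, `satisfied_fraction`, `best_fraction`) are assumptions made for this illustration, and the brute-force search is only feasible for very small instances.

```python
from itertools import product

# Assumed encoding: a clause is a tuple of nonzero ints,
# where k stands for variable x_k and -k for its negation.

def max_occurrences(clauses):
    """Largest number of literal occurrences of any single variable."""
    counts = {}
    for clause in clauses:
        for lit in clause:
            counts[abs(lit)] = counts.get(abs(lit), 0) + 1
    return max(counts.values())

def satisfied_fraction(clauses, assignment):
    """Fraction of clauses satisfied by a dict {variable: bool}."""
    satisfied = sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
    return satisfied / len(clauses)

def best_fraction(clauses):
    """Brute-force the largest satisfiable fraction (exponential time)."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    return max(
        satisfied_fraction(clauses, dict(zip(variables, bits)))
        for bits in product([False, True], repeat=len(variables))
    )

# Toy instance: all 8 sign patterns over x1, x2, x3.
# Every assignment falsifies exactly one clause, so the formula is
# unsatisfiable and no assignment satisfies more than 7/8 of the clauses.
toy = [
    (s1 * 1, s2 * 2, s3 * 3)
    for s1 in (1, -1)
    for s2 in (1, -1)
    for s3 in (1, -1)
]

print(max_occurrences(toy))  # occurrence bound of this toy instance
print(best_fraction(toy))    # best achievable fraction of satisfied clauses
```

In this toy instance each variable appears in all 8 clauses, and the best achievable fraction is 7/8, so it has the gap property with ϵ = 1/8. The variant described above requires both properties to hold across a whole family of instances, with the occurrence bound and ϵ fixed in advance.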


