We introduce a generic strategy for provably efficient multi-goal explor...
We propose UCBMQ, Upper Confidence Bound Momentum Q-learning, a new algo...
In this paper, we propose new problem-independent lower bounds on the sa...
Realistic environments often provide agents with very limited feedback. ...
In this work, we propose KeRNS: an algorithm for episodic reinforcement ...
Reward-free exploration is a reinforcement learning setting recently stu...
We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algo...
We consider the exploration-exploitation dilemma in finite-horizon reinf...