We study the autonomous exploration (AX) problem proposed by Lim Aue...
In contextual linear bandits, the reward function is assumed to be a lin...
Active learning with strong and weak labelers considers a practical sett...
We study the problem of representation learning in stochastic contextual...
We consider Contextual Bandits with Concave Rewards (CBCR), a multi-obje...
We study the sample complexity of learning an ϵ-optimal policy in the St...
We consider a multi-armed bandit setting where, at the beginning of each...
Contextual bandit algorithms are widely used in domains where it is desi...
This paper studies privacy-preserving exploration in Markov Decision Pro...
We introduce a generic strategy for provably efficient multi-goal explor...
We study the role of the representation of state-action value functions ...
We derive a novel asymptotic problem-dependent lower-bound for regret mi...
We study bandits and reinforcement learning (RL) subject to a conservati...
We study the problem of learning in the stochastic shortest path (SSP) s...
The linear contextual bandit literature is mostly focused on the design ...
Contextual bandit is a general framework for online learning in sequenti...
We investigate the exploration of an unknown environment when no reward ...
In the contextual linear bandit setting, algorithms built on the optimis...
Reinforcement learning algorithms are widely used in domains where it is...
A common assumption in reinforcement learning (RL) is to have access to ...
We consider the problem of exploration-exploitation in communicating Mar...
In this work, we propose KeRNS: an algorithm for episodic reinforcement ...
We study the problem of learning exploration-exploitation strategies tha...
We consider the exploration-exploitation dilemma in finite-horizon reinf...
We study the problem of efficient exploration in order to learn an accur...
In many sequential decision-making problems, the goal is to optimize a u...
Contextual bandit algorithms are applied in a wide range of domains, fro...
In many fields such as digital marketing, healthcare, finance, and robot...
While learning in an unknown Markov Decision Process (MDP), an agent sho...
We investigate concentration inequalities for Dirichlet and Multinomial ...
In this work, we present an alternative approach to making an agent comp...
Many popular reinforcement learning problems (e.g., navigation in a maze...
We consider the exploration-exploitation dilemma in finite-horizon reinf...
Policy gradient algorithms are among the best candidates for the much an...
We introduce and analyse two algorithms for exploration-exploitation in ...
While designing the state space of an MDP, it is common to include state...
In this paper, we propose a novel reinforcement-learning algorithm cons...
We consider the transfer of experience samples (i.e., tuples < s, a, s',...
We introduce SCAL, an algorithm designed to perform efficient exploratio...
In this paper, we propose a novel approach to automatically determine th...
This document contains supplementary material for the paper "Multi-objec...