The main challenge of offline reinforcement learning, where data is limi...
We focus on the task of approximating the optimal value function in deep...
We study the convergence behavior of the celebrated temporal-difference ...
Real-world deployment of machine learning models is challenging when dat...
We study task-agnostic continual reinforcement learning (TACRL) in which...
We employ Proximal Iteration for value-function optimization in reinforc...
Conditional quantile estimation is a key statistical learning challenge ...
Reliant on too many experiments to learn good actions, current Reinforce...
This paper prescribes a suite of techniques for off-policy Reinforcement...
Automated machine learning (AutoML) can produce complex model ensembles ...
We present TraDE, an attention-based architecture for auto-regressive de...
This paper introduces Meta-Q-Learning (MQL), a new off-policy algorithm ...
On-policy reinforcement learning (RL) algorithms have high sample comple...
Optimal selection of a subset of items from a given set is a hard proble...
Recent advances in spoken language technologies and the introduction of ...
For a speech-enhancement algorithm, it is highly desirable to simultaneo...
We present a method to improve video description generation by modeling ...