One of the most surprising puzzles in neural network generalisation is g...
Circuit analysis is a promising technique for understanding the internal...
To facilitate research in the direction of fine-tuning foundation models...
When robots learn reward functions using high capacity models that take ...
The field of AI alignment is concerned with AI systems that pursue unint...
Imitation learning often needs a large demonstration set in order to han...
We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lif...
The last decade has seen a significant increase of interest in deep lear...
Since reward functions are hard to specify, recent work has focused on l...
Given two sources of evidence about a latent variable, one can combine t...
Specifying reward functions for robots that operate in environments with...
In order for agents trained by deep reinforcement learning to work along...
Imitation Learning (IL) algorithms are typically evaluated in the same e...
While we would like agents that can coordinate with humans, current algo...
Our goal is for agents to optimize the right reward function, despite ho...
Reinforcement learning (RL) agents optimize only the features specified ...
Reward design, the problem of selecting an appropriate reward function f...