Reinforcement learning from human feedback (RLHF) is effective at aligni...
Consider exploration in sparse-reward or reward-free environments, s...
Exploration is essential for solving complex Reinforcement Learning (RL)...
Credit assignment in reinforcement learning is the problem of measuring ...
We consider the problem of efficient credit assignment in reinforcement
...
In the past few years, deep learning has transformed artificial intellig...
The biological plausibility of the backpropagation algorithm has long be...
In machine learning, error back-propagation in multi-layer neural networ...
We introduce a weight update formula that is expressed only in terms of
...