research ∙ 11/12/2022
The Expertise Problem: Learning from Specialized Feedback
Reinforcement learning from human feedback (RLHF) is a powerful techniqu...
research ∙ 07/17/2022