Power-seeking behavior is a key source of risk from advanced AI, but our...
The field of AI alignment is concerned with AI systems that pursue unint...
How can we design agents that pursue a given objective when all feedback...
This paper describes REALab, a platform for embedded agency research in...
Designing reward functions is difficult: the designer has to specify wha...
Proposals for safe AGI systems are typically made at the level of framew...
How can we design reinforcement learning agents that avoid causing unnec...
We present a suite of reinforcement learning environments illustrating v...
No real-world reward function is perfect. Sensory errors and software bu...