Objective Robustness in Deep Reinforcement Learning
We study objective robustness failures, a type of out-of-distribution robustness failure in reinforcement learning (RL). Objective robustness failures occur when an RL agent retains its capabilities off-distribution yet pursues the wrong objective. We provide the first explicit empirical demonstrations of objective robustness failures and argue that this type of failure is critical to address.
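To make the definition concrete, here is a minimal, hypothetical toy sketch (not an experiment from the paper): a gridworld where the goal always sits in the bottom-right corner during training, so a policy that simply "marches right" succeeds. The environment, the `always_right` policy, and all names below are illustrative assumptions, not the paper's setup.

```python
import random

WIDTH, HEIGHT = 10, 5

def run_episode(goal, policy, max_steps=30):
    """Step through a small gridworld; return True if the agent reaches the goal."""
    x, y = 0, 0
    for _ in range(max_steps):
        dx, dy = policy(x, y)
        x = max(0, min(WIDTH - 1, x + dx))
        y = max(0, min(HEIGHT - 1, y + dy))
        if (x, y) == goal:
            return True
    return False

# Proxy policy an agent could plausibly learn: march right along the bottom
# row, since that is where the goal always was during training.
def always_right(x, y):
    return (1, 0)

# Training distribution: goal fixed at the bottom-right corner -> proxy succeeds.
train_wins = sum(run_episode((WIDTH - 1, 0), always_right) for _ in range(100))

# Test distribution: goal position randomized. The agent still navigates
# competently (its capabilities are intact), but it pursues "go right" rather
# than "reach the goal", so it fails whenever the goal leaves the bottom row.
random.seed(0)
test_wins = sum(
    run_episode((random.randrange(WIDTH), random.randrange(HEIGHT)), always_right)
    for _ in range(100)
)

print(f"train success: {train_wins}/100")  # 100/100
print(f"test success:  {test_wins}/100")   # roughly 1/HEIGHT of episodes
```

The point of the sketch is that the off-distribution failure is not a capability failure: the policy executes exactly as competently at test time, but the objective it actually learned (a proxy correlated with reward only on the training distribution) diverges from the intended one.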