Autonomous Driving using Safe Reinforcement Learning by Incorporating a Regret-based Human Lane-Changing Decision Model
It is expected that many human drivers will still prefer to drive themselves even if the self-driving technologies are ready. Therefore, human-driven vehicles and autonomous vehicles (AVs) will coexist in a mixed traffic for a long time. To enable AVs to safely and efficiently maneuver in this mixed traffic, it is critical that the AVs can understand how humans cope with risks and make driving-related decisions. On the other hand, the driving environment is highly dynamic and ever-changing, and it is thus difficult to enumerate all the scenarios and hard-code the controllers. To face up these challenges, in this work, we incorporate a human decision-making model in reinforcement learning to control AVs for safe and efficient operations. Specifically, we adapt regret theory to describe a human driver's lane-changing behavior, and fit the personalized models to individual drivers for predicting their lane-changing decisions. The predicted decisions are incorporated in the safety constraints for reinforcement learning in training and in implementation. We then use an extended version of double deep Q-network (DDQN) to train our AV controller within the safety set. By doing so, the amount of collisions in training is reduced to zero, while the training accuracy is not impinged.
READ FULL TEXT