Behavior Planning For Connected Autonomous Vehicles Using Feedback Deep Reinforcement Learning
With the development of communication technologies, connected autonomous vehicles (CAVs) can share information with each other. Besides basic safety messages, they can also share their future plans. We propose a behavior planning method in which each CAV decides whether to change lanes or keep its lane, based on the information received from its neighbors and a policy learned by deep reinforcement learning (DRL). Our state design, built on the shared information, scales with the number of vehicles. The proposed feedback deep Q-learning algorithms integrate the policy learning process with a continuous state-space controller, which in turn gives feedback about actions and rewards to the learning process. We design both centralized and distributed DRL algorithms. In experiments, our behavior planning method increases traffic flow and driving comfort compared with a traditional rule-based control method. The experiments also show that the distributed learning result is comparable to the centralized learning result, which suggests that the behavior planning policy could be improved online. We also validate our algorithm in a more complicated scenario with two road closures on a freeway.
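To make two of these ideas concrete, the following is a minimal Python sketch, not the authors' implementation. It illustrates (1) one way a state built from shared neighbor information can keep a fixed length regardless of how many vehicles are nearby, and (2) one way a low-level controller could feed the executed action and a reward back to the learner. The `Neighbor` fields, the nine-element state layout, the `MAX_GAP` range, the `min_gap` threshold, and the reward values are all assumptions made for illustration; none of them come from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

KEEP_LANE, CHANGE_LEFT, CHANGE_RIGHT = 0, 1, 2
MAX_GAP = 100.0  # assumed communication range in metres (hypothetical)


@dataclass
class Neighbor:
    lane_offset: int         # -1 = left lane, 0 = ego lane, +1 = right lane
    gap: float               # signed longitudinal distance in metres; >= 0 means ahead
    plans_lane_change: bool  # shared future plan (hypothetical message field)


def encode_state(neighbors: List[Neighbor]) -> List[float]:
    """Build a fixed-length state (3 lanes x 3 features) from any number of neighbors.

    Per lane: normalized gap to the nearest leader, normalized gap to the
    nearest follower, and whether that leader announced a lane change.
    """
    state: List[float] = []
    for offset in (-1, 0, 1):
        ahead = [n for n in neighbors if n.lane_offset == offset and n.gap >= 0]
        behind = [n for n in neighbors if n.lane_offset == offset and n.gap < 0]
        lead = min(ahead, key=lambda n: n.gap, default=None)
        follow = max(behind, key=lambda n: n.gap, default=None)
        state.append(min(lead.gap, MAX_GAP) / MAX_GAP if lead else 1.0)
        state.append(min(-follow.gap, MAX_GAP) / MAX_GAP if follow else 1.0)
        state.append(1.0 if (lead and lead.plans_lane_change) else 0.0)
    return state


def controller_feedback(state: List[float], action: int,
                        min_gap: float = 0.2) -> Tuple[int, float]:
    """Hypothetical controller check returning (executed_action, reward).

    A requested lane change is rejected when either gap in the target lane
    is below min_gap; the learner then stores the executed action instead
    of the requested one. Reward values are illustrative assumptions.
    """
    if action == KEEP_LANE:
        return KEEP_LANE, 0.0
    base = 0 if action == CHANGE_LEFT else 6  # index of the target lane's features
    lead_gap, follow_gap = state[base], state[base + 1]
    if lead_gap < min_gap or follow_gap < min_gap:
        return KEEP_LANE, -1.0   # penalize an infeasible request
    return action, -0.1          # small assumed comfort cost for changing lanes
```

In a learning loop under these assumptions, the transition stored for the Q-update would use the action returned by `controller_feedback` rather than the action the policy requested, so the learner only credits maneuvers that were actually carried out.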