Achieving Correlated Equilibrium by Studying Opponent's Behavior Through Policy-Based Deep Reinforcement Learning
Game theory, the study of distributed decision-making behavior, has been developed extensively by many scholars. However, many existing works rely on strict assumptions, such as knowledge of the opponent's private behaviors, which might not be practical. In this work, we focus on two Nobel-winning concepts, the Nash equilibrium and the correlated equilibrium. Specifically, our proposed deep reinforcement learning algorithm reaches a correlated equilibrium outside the convex hull of the Nash equilibria. Given the correlated equilibrium probability distribution, we also propose a mathematical model that inverts the calculation of that distribution to estimate the opponent's payoff vector. With those payoffs, deep reinforcement learning learns why and how the rational opponent plays, instead of just learning the regions for corresponding strategies and actions. Through simulations, we show that our proposed method can achieve the optimal correlated equilibrium, which lies outside the convex hull of the Nash equilibria, with limited interaction among players.
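To make the phrase "outside the convex hull of the Nash equilibria" concrete, the following minimal sketch checks the correlated equilibrium conditions on a small example. It assumes the textbook game of Chicken with Aumann's classic payoffs, which are chosen purely for illustration and are not taken from this work; the function names `is_correlated_equilibrium` and `expected_payoffs` are likewise hypothetical. It does not implement the proposed deep reinforcement learning algorithm or the inverse payoff model.

```python
from fractions import Fraction as F

# Illustrative game of Chicken (assumed payoffs, not from the paper).
# Action index 0 = Dare, 1 = Chicken.
U1 = [[F(0), F(7)], [F(2), F(6)]]  # row player's payoff U1[a1][a2]
U2 = [[F(0), F(2)], [F(7), F(6)]]  # column player's payoff U2[a1][a2]


def is_correlated_equilibrium(p, u1, u2):
    """Return True if the joint distribution p[a1][a2] satisfies every
    incentive constraint: no player gains by deviating from a recommendation."""
    n1, n2 = len(p), len(p[0])
    # Row player: recommended action a, candidate deviation b.
    for a in range(n1):
        for b in range(n1):
            gain = sum(p[a][c] * (u1[b][c] - u1[a][c]) for c in range(n2))
            if gain > 0:
                return False
    # Column player: recommended action a, candidate deviation b.
    for a in range(n2):
        for b in range(n2):
            gain = sum(p[r][a] * (u2[r][b] - u2[r][a]) for r in range(n1))
            if gain > 0:
                return False
    return True


def expected_payoffs(p, u1, u2):
    """Expected payoff of each player under the joint distribution p."""
    cells = [(r, c) for r in range(len(p)) for c in range(len(p[0]))]
    return (sum(p[r][c] * u1[r][c] for r, c in cells),
            sum(p[r][c] * u2[r][c] for r, c in cells))


# Candidate correlated equilibrium: equal weight on every outcome except (Dare, Dare).
third = F(1, 3)
ce = [[F(0), third], [third, third]]

# The three Nash equilibria of this game, written as joint distributions:
# two pure ones and the mixed one in which each player dares with probability 1/3.
q = [F(1, 3), F(2, 3)]
nash = [
    [[F(0), F(1)], [F(0), F(0)]],            # (Dare, Chicken)
    [[F(0), F(0)], [F(1), F(0)]],            # (Chicken, Dare)
    [[q[i] * q[j] for j in range(2)] for i in range(2)],
]

ce_ok = is_correlated_equilibrium(ce, U1, U2)
ce_payoff = expected_payoffs(ce, U1, U2)
nash_payoffs = [expected_payoffs(n, U1, U2) for n in nash]

# A linear function attains its maximum over a convex hull at a vertex, so if the
# payoff sum of the correlated equilibrium exceeds the sum at every Nash
# equilibrium, its payoff vector lies outside the hull of the Nash payoffs.
outside_hull = sum(ce_payoff) > max(sum(v) for v in nash_payoffs)

print(ce_ok, ce_payoff, nash_payoffs, outside_hull)
```

In this assumed game the correlated distribution gives each player an expected payoff of 5, while the Nash payoff sums are 9, 9, and 28/3, so the payoff vector (5, 5) cannot be written as a convex combination of Nash payoffs. Exact fractions are used so the incentive constraints are tested without floating-point tolerance.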