A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games

07/18/2022
by Zihan Ding, et al.

This paper proposes novel, end-to-end deep reinforcement learning algorithms for learning two-player zero-sum Markov games. Unlike prior efforts that train agents to beat a fixed set of opponents, our objective is to find Nash equilibrium policies that cannot be exploited, even by adversarial opponents. We propose (1) the Nash DQN algorithm, which integrates DQN with a Nash-finding subroutine for the joint value functions; and (2) the Nash DQN Exploiter algorithm, which additionally adopts an exploiter to guide the agent's exploration. Our algorithms are practical variants of theoretical algorithms that are guaranteed to converge to Nash equilibria in the basic tabular setting. Experimental evaluation on both tabular examples and two-player Atari games demonstrates the robustness of the proposed algorithms against adversarial opponents, as well as their advantage in performance over existing methods.
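To make the idea of a "Nash finding subroutine for the joint value functions" concrete, here is a minimal sketch in Python. It assumes that, at a single state, the joint Q-values form a payoff matrix for a zero-sum matrix game, and it approximates an equilibrium with fictitious play. Fictitious play is a stand-in chosen for simplicity; the abstract does not say which solver the paper uses. The function names `solve_zero_sum_fictitious_play` and `exploitability` are hypothetical and do not come from the paper.

```python
def solve_zero_sum_fictitious_play(payoff, iterations=20000):
    """Approximate a Nash equilibrium of a zero-sum matrix game.

    payoff[i][j] is the row player's payoff when row plays i and column
    plays j; the column player receives the negative of it.
    Returns (row_strategy, col_strategy, value_estimate).
    Assumption: fictitious play is used here only as a simple solver.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    # Cumulative payoff of each row action against the column history.
    row_scores = [0.0] * n_rows
    # Cumulative payoff (to the row player) of each column action
    # against the row history; the column player wants this small.
    col_scores = [0.0] * n_cols
    i, j = 0, 0  # arbitrary initial actions
    for _ in range(iterations):
        row_counts[i] += 1
        col_counts[j] += 1
        for a in range(n_rows):
            row_scores[a] += payoff[a][j]
        for b in range(n_cols):
            col_scores[b] += payoff[i][b]
        # Each player best-responds to the opponent's empirical play.
        i = max(range(n_rows), key=lambda a: row_scores[a])
        j = min(range(n_cols), key=lambda b: col_scores[b])
    row_strategy = [c / iterations for c in row_counts]
    col_strategy = [c / iterations for c in col_counts]
    value = sum(
        row_strategy[a] * payoff[a][b] * col_strategy[b]
        for a in range(n_rows)
        for b in range(n_cols)
    )
    return row_strategy, col_strategy, value


def exploitability(payoff, row_strategy, col_strategy):
    """Sum of both players' best-response gains; zero at a Nash equilibrium."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Best payoff the row player can get against col_strategy.
    best_row = max(
        sum(payoff[a][b] * col_strategy[b] for b in range(n_cols))
        for a in range(n_rows)
    )
    # Lowest payoff the column player can hold row_strategy to.
    best_col = min(
        sum(row_strategy[a] * payoff[a][b] for a in range(n_rows))
        for b in range(n_cols)
    )
    return best_row - best_col


# Matching pennies: the unique equilibrium mixes both actions equally.
pennies = [[1, -1], [-1, 1]]
x, y, v = solve_zero_sum_fictitious_play(pennies)
gap = exploitability(pennies, x, y)
```

In a DQN-style method, the equilibrium value of the next state's Q-matrix would presumably take the place of the single-agent max in the bootstrap target; that reading is an interpretation of the abstract, not a detail it states. The exploitability gap is one common way to quantify how far a strategy pair is from being "free from exploitation".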

research
04/14/2023

Coarse Correlated Equilibrium Implies Nash Equilibrium in Two-Player Zero-Sum Games

We give a simple proof of the well-known result that the marginal strate...
research
06/07/2018

Re-evaluating evaluation

Progress in machine learning is measured by careful evaluation on proble...
research
06/15/2020

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Finding approximate Nash equilibria in zero-sum imperfect-information ga...
research
01/11/2021

Solving Common-Payoff Games with Approximate Policy Iteration

For artificially intelligent learning systems to have widespread applica...
research
06/18/2020

DREAM: Deep Regret minimization with Advantage baselines and Model-free learning

We introduce DREAM, a deep reinforcement learning algorithm that finds o...
research
06/06/2022

Specification-Guided Learning of Nash Equilibria with High Social Welfare

Reinforcement learning has been shown to be an effective strategy for au...
research
04/18/2020

Achieving Correlated Equilibrium by Studying Opponent's Behavior Through Policy-Based Deep Reinforcement Learning

Game theory is a very profound study on distributed decision-making beha...
