SC-PSRO: A Unified Strategy Learning Method for Normal-form Games

08/24/2023
by Yudong Hu, et al.

Solving for a Nash equilibrium is the key challenge in normal-form games with large strategy spaces, where open-ended learning frameworks provide an efficient approach. Previous studies invariably use diversity as the means of driving strategy improvement. However, diversity-based algorithms work only in zero-sum games with cyclic dimensions, which limits their applicability. Here, we propose a unified open-ended learning framework, SC-PSRO (Self-Confirming Policy Space Response Oracle), as a general framework for both zero-sum and general-sum games. In particular, we introduce the advantage function as an improved evaluation metric for strategies, which gives agents in normal-form games a unified learning objective. Concretely, SC-PSRO comprises three components: 1) A Diversity Module, which prevents strategies from being constrained by the cyclic structure. 2) A LookAhead Module, which improves strategies in the transitive dimension. This module is theoretically guaranteed to learn strategies in the direction of the Nash equilibrium. 3) A Confirming-based Population Clipping Module, which tackles the equilibrium selection problem in general-sum games. This module can be applied to learn equilibria with optimal rewards, which to our knowledge is the first improvement for general-sum games. Our experiments indicate that SC-PSRO achieves a considerable decrease in exploitability in zero-sum games and an increase in rewards in general-sum games, markedly surpassing previous methods. Code will be released upon acceptance.
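For readers unfamiliar with the exploitability metric mentioned above, the following is a minimal sketch of how it is commonly computed for a two-player zero-sum normal-form game. It is not taken from the paper: the function name `exploitability`, the example matrix `RPS`, and the choice of the unnormalized sum of best-response gains (some authors divide by two) are all assumptions made for illustration.

```python
import numpy as np

# Illustrative payoff matrix (rock-paper-scissors) for the row player;
# the column player is assumed to receive the negative of each entry.
RPS = np.array([[0.0, -1.0, 1.0],
                [1.0, 0.0, -1.0],
                [-1.0, 1.0, 0.0]])


def exploitability(payoff, row_strategy, col_strategy):
    """Sum of both players' best-response gains in a zero-sum matrix game.

    Hypothetical helper, not the paper's code. The value of the current
    strategy profile cancels between the two players, so the result is
    max_i (A y)_i - min_j (x^T A)_j, which is zero at a Nash equilibrium.
    """
    A = np.asarray(payoff, dtype=float)
    x = np.asarray(row_strategy, dtype=float)
    y = np.asarray(col_strategy, dtype=float)
    # Best payoff the row player can reach against the column mixture y.
    row_best = np.max(A @ y)
    # Lowest row payoff the column player can force against the row mixture x.
    col_best = np.min(x @ A)
    return float(row_best - col_best)


uniform = np.ones(3) / 3
always_rock = np.array([1.0, 0.0, 0.0])

print(exploitability(RPS, uniform, uniform))      # uniform play is an equilibrium
print(exploitability(RPS, always_rock, uniform))  # a pure strategy can be exploited
```

Under these assumptions, a lower value means the strategy profile is closer to a Nash equilibrium, which is the sense in which the abstract reports a "decrease in exploitability".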

