Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization

05/31/2021
by Shicong Cen, et al.

This paper investigates the problem of computing the equilibrium of competitive games, which is often modeled as a constrained saddle-point optimization problem with probability simplex constraints. Despite recent efforts in understanding the last-iterate convergence of extragradient methods in the unconstrained setting, the theoretical underpinnings of these methods in the constrained setting, especially those using multiplicative updates, remain highly inadequate, even when the objective function is bilinear. Motivated by the algorithmic role of entropy regularization in single-agent reinforcement learning and game theory, we develop provably efficient extragradient methods that find the quantal response equilibrium (QRE) – the solution to a zero-sum two-player matrix game with entropy regularization – at a linear rate. The proposed algorithms can be implemented in a decentralized manner, where each player executes symmetric and multiplicative updates iteratively using its own payoff without directly observing the opponent's actions. In addition, by tuning the strength of entropy regularization, the proposed algorithms can locate an approximate Nash equilibrium of the unregularized matrix game at a sublinear rate without assuming the Nash equilibrium to be unique. Our methods also lead to efficient policy extragradient algorithms for solving entropy-regularized zero-sum Markov games at a linear rate. All of our convergence rates are nearly dimension-free, i.e., independent of the size of the state and action spaces up to logarithmic factors, highlighting the positive role of entropy regularization in accelerating convergence.
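To make the setting concrete, below is a minimal NumPy sketch of an entropy-regularized extragradient iteration with multiplicative updates for the matrix-game saddle point max_x min_y x^T A y + tau*H(x) - tau*H(y) over probability simplices, where H denotes Shannon entropy and the QRE is its unique solution. The specific update form, step size eta, and function name qre_extragradient are illustrative assumptions following the general extragradient template described in the abstract, not the paper's exact algorithm or tuning.

```python
import numpy as np

def qre_extragradient(A, tau=0.1, eta=0.05, iters=2000, tol=1e-10):
    """Sketch: entropy-regularized extragradient with multiplicative updates
    for max_x min_y x^T A y + tau*H(x) - tau*H(y) over probability simplices.
    Update form and step size are illustrative, not the paper's exact tuning."""
    m, n = A.shape
    x, y = np.full(m, 1.0 / m), np.full(n, 1.0 / n)

    def normalize(logits):
        # Softmax computed in log space for numerical stability.
        v = np.exp(logits - logits.max())
        return v / v.sum()

    for _ in range(iters):
        # Midpoint (extrapolation) step: multiplicative update
        # x_mid(a) ∝ x(a)^{1 - eta*tau} * exp(eta * [A y]_a), and symmetrically
        # for the min player with payoff -A^T x.
        x_mid = normalize((1 - eta * tau) * np.log(x) + eta * (A @ y))
        y_mid = normalize((1 - eta * tau) * np.log(y) - eta * (A.T @ x))
        # Extragradient step: the same update re-evaluated at midpoint payoffs.
        x_new = normalize((1 - eta * tau) * np.log(x) + eta * (A @ y_mid))
        y_new = normalize((1 - eta * tau) * np.log(y) - eta * (A.T @ x_mid))
        if max(np.abs(x_new - x).max(), np.abs(y_new - y).max()) < tol:
            return x_new, y_new
        x, y = x_new, y_new
    return x, y

# Example: matching pennies; its QRE approaches the uniform Nash equilibrium.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = qre_extragradient(A)
print(x, y)  # both close to [0.5, 0.5]
```

Note that each player's update touches only its own payoff vector (A @ y or A.T @ x), matching the decentralized, symmetric structure described above, and shrinking tau toward zero trades the faster linear rate for a closer approximation to a Nash equilibrium of the unregularized game.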


