Consensus Multiplicative Weights Update: Learning to Learn using Projector-based Game Signatures

by Nelson Vadori, et al.

Recently, Optimistic Multiplicative Weights Update (OMWU) was proven to be the first constant step-size algorithm in the online no-regret framework to enjoy last-iterate convergence to Nash Equilibria in the constrained zero-sum bimatrix case, where weights represent the probabilities of playing pure strategies. We introduce the second such algorithm, Consensus MWU (CMWU), for which we prove local convergence and show empirically that it enjoys faster and more robust convergence than OMWU. Our algorithm shows the importance of a new object, the simplex Hessian, as well as of the interaction of the game with the (eigen)space of vectors summing to zero, which we believe future research can build on. Like OMWU, CMWU has convergence guarantees in the zero-sum case only, but Cheung and Piliouras (2020) recently showed that OMWU and MWU display opposite convergence properties depending on whether the game is zero-sum or cooperative. Inspired by this work and the recent literature on learning to optimize for single functions, we extend CMWU to non-zero-sum games by introducing a new framework for online learning in games, where the update rule's gradient and Hessian coefficients along a trajectory are learnt by a reinforcement learning policy that is conditioned on the nature of the game: the game signature. We construct the latter using a new canonical decomposition of two-player games into eight components corresponding to commutative projection operators, generalizing and unifying recent game concepts studied in the literature. We show empirically that our new learning policy is able to exploit the game signature across a wide range of game types.
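To make the contrast between MWU and OMWU in the zero-sum bimatrix setting concrete, the sketch below runs both with a constant step size on matching pennies, whose unique Nash equilibrium is uniform play. It is only an illustration under stated assumptions: the update forms are the textbook MWU and OMWU rules rather than anything taken from the paper, CMWU itself is not implemented because the abstract does not give its update rule, and the function names, step size, starting point, and the choice to initialise the "previous" gradient with the current one are all hypothetical.

```python
import math

def payoff_gradients(A, x, y):
    """Row player maximises x^T A y; column player minimises it."""
    n, m = len(A), len(A[0])
    gx = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]   # A y
    gy = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]  # -A^T x
    return gx, gy

def mw_step(p, signal, eta):
    """One multiplicative-weights step followed by renormalisation."""
    w = [pi * math.exp(eta * s) for pi, s in zip(p, signal)]
    total = sum(w)
    return [wi / total for wi in w]

def run(A, x, y, eta=0.1, steps=2000, optimistic=False):
    """Simultaneous MWU (or OMWU) updates with a constant step size."""
    # Assumption: at t = 0 the previous gradient equals the current one.
    gx_prev, gy_prev = payoff_gradients(A, x, y)
    for _ in range(steps):
        gx, gy = payoff_gradients(A, x, y)
        if optimistic:
            # OMWU uses the extrapolated signal 2 * g_t - g_{t-1}.
            sx = [2 * g - gp for g, gp in zip(gx, gx_prev)]
            sy = [2 * g - gp for g, gp in zip(gy, gy_prev)]
        else:
            sx, sy = gx, gy
        x, y = mw_step(x, sx, eta), mw_step(y, sy, eta)
        gx_prev, gy_prev = gx, gy
    return x, y

def distance_to_uniform(x, y):
    """Euclidean distance of the joint strategy from uniform play."""
    dx = sum((xi - 1.0 / len(x)) ** 2 for xi in x)
    dy = sum((yi - 1.0 / len(y)) ** 2 for yi in y)
    return math.sqrt(dx + dy)

if __name__ == "__main__":
    A = [[1.0, -1.0], [-1.0, 1.0]]    # matching pennies
    x0, y0 = [0.7, 0.3], [0.4, 0.6]   # hypothetical starting strategies
    for name, flag in (("MWU", False), ("OMWU", True)):
        x, y = run(A, x0, y0, optimistic=flag)
        print(name, "distance to equilibrium:", distance_to_uniform(x, y))
```

In this toy run the last OMWU iterate should end up close to the uniform equilibrium, while the last MWU iterate drifts away from it towards the boundary of the simplex, which is the last-iterate behaviour the abstract refers to.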



