Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games

03/05/2023
by Yang Cai, et al.

We revisit the problem of learning in two-player zero-sum Markov games, focusing on developing an algorithm that is uncoupled, convergent, and rational, with non-asymptotic convergence rates. We start from the case of stateless matrix games with bandit feedback as a warm-up, showing an 𝒪(t^{-1/8}) last-iterate convergence rate. To the best of our knowledge, this is the first result that obtains a finite last-iterate convergence rate given access to only bandit feedback. We extend our result to the case of irreducible Markov games, providing a last-iterate convergence rate of 𝒪(t^{-1/(9+ε)}) for any ε>0. Finally, we study Markov games without any assumptions on the dynamics, and show a path convergence rate, a new notion of convergence that we define, of 𝒪(t^{-1/10}). Our algorithm removes the synchronization and prior knowledge requirements of [Wei et al., 2021], which pursued the same goals for irreducible Markov games. Our algorithm is related to [Chen et al., 2021, Cen et al., 2021] and also builds on the entropy regularization technique. However, we remove their requirement that the players communicate entropy values, making our algorithm entirely uncoupled.
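
To make the warm-up setting concrete, here is a minimal sketch of the two ingredients the abstract refers to for a stateless matrix game: bandit feedback and the quantity that last-iterate convergence is usually measured by. It is not the paper's algorithm. The function names, the convention that the row player minimizes, and the rock-paper-scissors payoff matrix are all assumptions made for illustration. A last-iterate guarantee bounds the duality gap of the strategies actually played at round t, rather than the gap of their time averages.

```python
import random

def duality_gap(A, x, y):
    """Duality gap of the strategy pair (x, y) in the matrix game A.

    Assumed convention: the row player picks x to minimize x^T A y and the
    column player picks y to maximize it. The gap is zero exactly at a
    Nash equilibrium and positive otherwise.
    """
    n_rows, n_cols = len(A), len(A[0])
    # Payoff of each pure column action against the row strategy x.
    col_payoffs = [sum(x[i] * A[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # Payoff of each pure row action against the column strategy y.
    row_payoffs = [sum(A[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows)]
    return max(col_payoffs) - min(row_payoffs)

def bandit_feedback(A, x, y, rng):
    """One round of play: each player samples its own action independently.

    Under bandit feedback a player learns only its own action and the
    realized payoff A[i][j], not the opponent's action or strategy.
    """
    i = rng.choices(range(len(x)), weights=x)[0]
    j = rng.choices(range(len(y)), weights=y)[0]
    return i, j, A[i][j]

# Hypothetical example game: rock-paper-scissors payoffs.
A = [[0.0, 1.0, -1.0],
     [-1.0, 0.0, 1.0],
     [1.0, -1.0, 0.0]]
uniform = [1.0 / 3] * 3
pure_first = [1.0, 0.0, 0.0]

gap_at_equilibrium = duality_gap(A, uniform, uniform)
gap_off_equilibrium = duality_gap(A, pure_first, uniform)
i, j, payoff = bandit_feedback(A, pure_first, pure_first, random.Random(0))
```

In this example the uniform pair is the equilibrium, so its gap is zero, while a row player committed to its first action can be exploited for a gap of 1. An uncoupled learner would update its own strategy using only its own sampled action and `payoff` from each round.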


