Fictitious play in zero-sum stochastic games

10/08/2020
by Muhammed O. Sayin, et al.

We present fictitious play dynamics for the general class of stochastic games and analyze their convergence properties in zero-sum stochastic games. In these dynamics, each agent forms beliefs about the opponent's strategy and about its own continuation payoff (Q-function), and plays a myopic best response computed from the estimated continuation payoffs. Agents update their beliefs at the states they visit, based on observed opponent actions. A key property of the learning dynamics is that the beliefs on Q-functions are updated on a slower timescale than the beliefs on strategies. We show that, in both the model-based and model-free cases (the latter without knowledge of agent payoff functions and state transition probabilities), the beliefs on strategies converge to a stationary mixed Nash equilibrium of the zero-sum stochastic game.
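
The dynamics described in the abstract can be illustrated with a minimal model-based sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-state, two-action game, the payoff table `R`, the `transition` rule, the discount factor `GAMMA`, and the step sizes `alpha` and `beta` (chosen only so that `beta / alpha` tends to zero, which gives the Q-beliefs the slower timescale).

```python
import random

GAMMA = 0.5                      # discount factor (assumed)
STATES, ACTIONS = (0, 1), (0, 1)

# Hypothetical stage payoffs to player 1, indexed R[s][a1][a2];
# player 2 receives the negative (zero-sum).
R = {0: [[1.0, -1.0], [-1.0, 1.0]],
     1: [[2.0, 0.0], [0.0, 1.0]]}

def transition(s, a1, a2):
    """Hypothetical probability that the next state is 0."""
    return 0.8 if a1 == a2 else 0.2

def expected(Q, belief):
    """Expected Q-value of each own action against a belief over opponent actions."""
    return [sum(Q[a][b] * belief[b] for b in ACTIONS) for a in ACTIONS]

def run(steps=20000, seed=0):
    rng = random.Random(seed)
    # belief[i][s]: player i's belief about the opponent's mixed strategy at s.
    belief = [{s: [0.5, 0.5] for s in STATES} for _ in range(2)]
    # Q[i][s][own][opp]: player i's belief about its own continuation payoff.
    Q = [{s: [[0.0, 0.0], [0.0, 0.0]] for s in STATES} for _ in range(2)]
    visits = {s: 0 for s in STATES}
    s = 0
    for _ in range(steps):
        visits[s] += 1
        n = visits[s]
        alpha = n ** -0.6        # faster step: strategy beliefs (assumed schedule)
        beta = 1.0 / n           # slower step: Q beliefs (beta / alpha -> 0)

        # Myopic best response to current beliefs, using estimated Q.
        acts = []
        for i in range(2):
            vals = expected(Q[i][s], belief[i][s])
            acts.append(vals.index(max(vals)))

        for i in range(2):
            # Strategy belief: move toward the opponent's observed action.
            opp = acts[1 - i]
            for b in ACTIONS:
                belief[i][s][b] += alpha * ((b == opp) - belief[i][s][b])
            # Q belief at the visited state, using the known model.
            for own in ACTIONS:
                for other in ACTIONS:
                    a1, a2 = (own, other) if i == 0 else (other, own)
                    r = R[s][a1][a2] if i == 0 else -R[s][a1][a2]
                    p0 = transition(s, a1, a2)
                    cont = (p0 * max(expected(Q[i][0], belief[i][0]))
                            + (1 - p0) * max(expected(Q[i][1], belief[i][1])))
                    Q[i][s][own][other] += beta * (r + GAMMA * cont - Q[i][s][own][other])

        # Sample the next state from the (assumed) transition rule.
        s = 0 if rng.random() < transition(s, acts[0], acts[1]) else 1
    return belief, Q

if __name__ == "__main__":
    belief, _ = run()
    print(belief[0], belief[1])
```

This sketch only mirrors the structure described above (beliefs updated at visited states, best responses computed from estimated Q-functions, two step-size schedules); it makes no claim to reproduce the paper's exact update rules or its convergence guarantees.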

research
10/05/2021

Stochastic Multiplicative Weights Updates in Zero-Sum Games

We study agents competing against each other in a repeated network zero-...
research
04/09/2023

Higher-Order Uncoupled Dynamics Do Not Lead to Nash Equilibrium – Except When They Do

The framework of multi-agent learning explores the dynamics of how indiv...
research
09/20/2022

Mutual knowledge of rationality and correct beliefs in n-person games: An impossibility theorem

There are two well-known sufficient conditions for Nash equilibrium: com...
research
01/09/2019

Learning by Fictitious Play in Large Populations

We consider learning by fictitious play in a large population of agents ...
research
06/04/2021

Decentralized Q-Learning in Zero-sum Markov Games

We study multi-agent reinforcement learning (MARL) in infinite-horizon d...
research
02/20/2023

Efficient-Q Learning for Stochastic Games

We present the new efficient-Q learning dynamics for stochastic games be...
research
09/06/2023

Episodic Logit-Q Dynamics for Efficient Learning in Stochastic Teams

We present new learning dynamics combining (independent) log-linear lear...
