Block-Coordinate Methods and Restarting for Solving Extensive-Form Games

by Darshan Chakrabarti, et al.

Coordinate descent methods are popular in machine learning and optimization for their simple sparse updates and excellent practical performance. These same properties would be attractive in large-scale sequential game solving, but until now no such methods were known, because the strategy spaces do not satisfy the separable block structure that such methods typically exploit. We present the first cyclic coordinate-descent-like method for the polytope of sequence-form strategies, which form the strategy spaces for the players in an extensive-form game (EFG). Our method exploits the recursive structure of the proximal update induced by dilated regularizers to allow for a pseudo block-wise update. We show that our method enjoys an O(1/T) convergence rate to a two-player zero-sum Nash equilibrium, while avoiding the worst-case polynomial scaling with the number of blocks common to cyclic methods. We empirically show that our algorithm usually performs better than other state-of-the-art first-order methods (i.e., mirror prox), and occasionally can even beat CFR^+, a state-of-the-art algorithm for numerical equilibrium computation in zero-sum EFGs. We then introduce a restarting heuristic for EFG solving. We show empirically that restarting can lead to speedups, sometimes huge, both for our cyclic method and for existing methods such as mirror prox and predictive CFR^+.
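For readers unfamiliar with the mirror prox baseline named in the abstract, the following is a minimal sketch of entropic mirror prox on a small zero-sum matrix game. It is not the block-coordinate method of the paper and does not handle sequence-form polytopes or dilated regularizers. The payoff matrix, step size, iteration count, and all function names are illustrative assumptions. The row player minimizes x^T A y and the column player maximizes it; the averaged extrapolated iterates are the ones that carry the O(1/T) guarantee.

```python
import math


def normalize(weights):
    """Rescale nonnegative weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def entropic_step(point, gradient, eta):
    """Entropy-regularized prox step on the simplex (multiplicative weights)."""
    return normalize([p * math.exp(-eta * g) for p, g in zip(point, gradient)])


def row_losses(A, y):
    """(A y)_i: expected loss of each pure row strategy against y."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]


def column_gains(A, x):
    """(A^T x)_j: expected gain of each pure column strategy against x."""
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]


def duality_gap(A, x, y):
    """Best-response gain against x minus best-response loss against y."""
    return max(column_gains(A, x)) - min(row_losses(A, y))


def mirror_prox(A, eta=0.1, iters=2000):
    """Entropic mirror prox; returns the average of the extrapolated iterates."""
    n, m = len(A), len(A[0])
    x, y = [1.0 / n] * n, [1.0 / m] * m
    x_sum, y_sum = [0.0] * n, [0.0] * m
    for _ in range(iters):
        # Extrapolation step: gradients are taken at the current point.
        x_half = entropic_step(x, row_losses(A, y), eta)
        y_half = entropic_step(y, [-g for g in column_gains(A, x)], eta)
        # Update step: restart from the current point, but use the
        # gradients evaluated at the extrapolated point.
        x_next = entropic_step(x, row_losses(A, y_half), eta)
        y_next = entropic_step(y, [-g for g in column_gains(A, x_half)], eta)
        x, y = x_next, y_next
        x_sum = [s + v for s, v in zip(x_sum, x_half)]
        y_sum = [s + v for s, v in zip(y_sum, y_half)]
    return [s / iters for s in x_sum], [s / iters for s in y_sum]


# Hypothetical 2x2 game chosen for illustration; its unique equilibrium
# has both players putting probability 0.4 on their first action.
A = [[2.0, -1.0], [-1.0, 1.0]]
x_avg, y_avg = mirror_prox(A)
gap = duality_gap(A, x_avg, y_avg)
```

The step size here is kept below the reciprocal of the largest absolute payoff, which is the usual sufficient condition for the entropic setup on simplices. Extending this to an EFG would require replacing the simplex prox step with a prox step over the sequence-form polytope.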



