Double Neural Counterfactual Regret Minimization

12/27/2018
by Hui Li, et al.

Counterfactual Regret Minimization (CFR) is a fundamental and effective technique for solving Imperfect Information Games (IIG). However, the original CFR algorithm only works for discrete state and action spaces, and the resulting strategy is maintained as a tabular representation. This tabular representation prevents the method from being applied directly to large games and from continuing to improve when started from a poor strategy profile. In this paper, we propose a double neural representation for imperfect information games, where one neural network represents the cumulative regret and the other represents the average strategy. Furthermore, we adopt the counterfactual regret minimization algorithm to optimize this double neural representation. To make neural learning efficient, we also develop several novel techniques, including a robust sampling method, mini-batch Monte Carlo Counterfactual Regret Minimization (MCCFR), and Monte Carlo Counterfactual Regret Minimization Plus (MCCFR+), which may be of independent interest. Experimentally, we demonstrate that the proposed double neural algorithm converges significantly better than its reinforcement learning counterpart.
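For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of standard tabular regret matching at a single information set. It is not the paper's neural method: it only illustrates the cumulative regret and the average strategy that, according to the abstract, the two networks represent. The function names `regret_matching` and `average_strategy` are hypothetical, chosen for illustration.

```python
from typing import List


def regret_matching(cumulative_regret: List[float]) -> List[float]:
    """Current strategy: play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    n = len(cumulative_regret)
    return [1.0 / n] * n


def average_strategy(cumulative_strategy: List[float]) -> List[float]:
    """Average strategy: normalize the accumulated (reach-weighted) strategy sums."""
    total = sum(cumulative_strategy)
    if total > 0.0:
        return [s / total for s in cumulative_strategy]
    # Nothing accumulated yet: fall back to the uniform strategy.
    n = len(cumulative_strategy)
    return [1.0 / n] * n


# Illustrative values only (assumed, not taken from the paper).
current = regret_matching([2.0, -1.0, 2.0])
average = average_strategy([1.0, 3.0])
```

In a tabular solver, both lists are stored for every information set, which is the memory cost the abstract identifies as the obstacle for large games.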

research
09/04/2023

Pure Monte Carlo Counterfactual Regret Minimization

Counterfactual Regret Minimization (CFR) and its variants are the best a...
research
12/03/2020

Model-free Neural Counterfactual Regret Minimization with Bootstrap Learning

Counterfactual Regret Minimization (CFR) has achieved many fascinating r...
research
05/18/2021

CFR-MIX: Solving Imperfect Information Extensive-Form Games with Combinatorial Action Space

In many real-world scenarios, a team of agents coordinate with each othe...
research
05/27/2023

Hierarchical Deep Counterfactual Regret Minimization

Imperfect Information Games (IIGs) offer robust models for scenarios whe...
research
10/15/2021

Combining Counterfactual Regret Minimization with Information Gain to Solve Extensive Games with Imperfect Information

Counterfactual regret Minimization (CFR) is an effective algorithm for s...
research
12/18/2018

Monte Carlo Continual Resolving for Online Strategy Computation in Imperfect Information Games

Online game playing algorithms produce high-quality strategies with a fr...
research
01/22/2019

Single Deep Counterfactual Regret Minimization

Counterfactual Regret Minimization (CFR) is the most successful algorith...