Learning to Play against Any Mixture of Opponents

09/29/2020
by Max Olan Smith, et al.

Intuitively, experience playing against one mixture of opponents in a given domain should be relevant for a different mixture in the same domain. We propose a transfer learning method, Q-Mixing, that starts by learning Q-values against each pure-strategy opponent. A Q-value for any distribution of opponent strategies is then approximated by appropriately averaging the separately learned Q-values. From these components, we construct policies against all opponent mixtures without any further training. We empirically validate Q-Mixing in two environments: a simple grid-world soccer environment and a complicated cyber-security game. We find that Q-Mixing successfully transfers knowledge across any mixture of opponents. We next consider the use of observations during play to update the believed distribution of opponents. We introduce an opponent classifier (trained in parallel to Q-learning, using the same data) and use the classifier results to refine the mixing of Q-values. We find that Q-Mixing augmented with the opponent classifier performs comparably to, and with lower variance than, training directly against a mixed-strategy opponent.
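The averaging step described above can be illustrated with a minimal sketch. It assumes the simplest reading of "appropriately averaging": for a single observation, each pure-strategy opponent's Q-values are weighted by that opponent's probability in the mixture, and the policy acts greedily on the result. The function names (`mix_q_values`, `greedy_action`) and the toy numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def mix_q_values(q_per_opponent: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Weighted average of per-opponent Q-values for one observation.

    q_per_opponent[i][a] is the (already learned) Q-value of action a
    against pure-strategy opponent i. weights[i] is the believed
    probability of facing opponent i. Both are illustrative assumptions.
    """
    if len(q_per_opponent) != len(weights):
        raise ValueError("need exactly one weight per opponent")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_actions = len(q_per_opponent[0])
    mixed = [0.0] * n_actions
    for q_values, weight in zip(q_per_opponent, weights):
        for action in range(n_actions):
            # Normalize so the weights form a probability distribution.
            mixed[action] += (weight / total) * q_values[action]
    return mixed


def greedy_action(q_values: Sequence[float]) -> int:
    """Index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Toy example: two opponents, two actions.
q_table = [[1.0, 0.0],   # Q-values against opponent 0
           [0.0, 2.0]]   # Q-values against opponent 1

mostly_opponent_0 = mix_q_values(q_table, [0.75, 0.25])
mostly_opponent_1 = mix_q_values(q_table, [0.25, 0.75])
print(mostly_opponent_0, greedy_action(mostly_opponent_0))
print(mostly_opponent_1, greedy_action(mostly_opponent_1))
```

In this sketch, changing the mixture only changes `weights`, so no retraining is involved. Under the same assumptions, the opponent classifier's output for the current observation could be passed as `weights` in place of the fixed prior, which is one way to read "refine the mixing of Q-values".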
