A Zeroth-Order Momentum Method for Risk-Averse Online Convex Games

09/06/2022
by   Zifan Wang, et al.

We consider risk-averse learning in repeated unknown games where the goal of the agents is to minimize their individual risk of incurring significantly high cost. Specifically, the agents use the conditional value at risk (CVaR) as a risk measure and rely on bandit feedback in the form of the cost values of the selected actions at every episode to estimate their CVaR values and update their actions. A major challenge in using bandit feedback to estimate CVaR is that the agents can only access their own cost values, which, however, depend on the actions of all agents. To address this challenge, we propose a new risk-averse learning algorithm with momentum that utilizes the full historical information on the cost values. We show that this algorithm achieves sub-linear regret and matches the best known algorithms in the literature. We provide numerical experiments for a Cournot game that show that our method outperforms existing methods.
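As a rough illustration of the quantity the agents estimate from their observed costs, the sketch below computes an empirical CVaR from a list of cost samples. It is not the paper's estimator or its momentum update. It assumes the convention that `alpha` is the fraction of worst-case (highest-cost) samples to average over, and the function name `empirical_cvar` is hypothetical.

```python
import math


def empirical_cvar(costs, alpha):
    """Average of the worst (highest) alpha-fraction of the cost samples.

    Assumption: alpha in (0, 1] is the tail fraction, so alpha = 1 gives the
    plain mean and small alpha focuses on the most costly outcomes.
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    # Number of tail samples to average; always keep at least one.
    k = max(1, math.ceil(alpha * len(costs)))
    # Sort descending so the highest costs come first.
    worst = sorted(costs, reverse=True)[:k]
    return sum(worst) / k


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0]
    print(empirical_cvar(samples, 0.5))  # mean of the two highest costs
    print(empirical_cvar(samples, 1.0))  # plain mean of all costs
```

In the bandit setting described above, each agent would only have its own observed cost values to feed into such an estimate, which is why reusing historical samples matters.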


research
03/16/2022

Risk-Averse No-Regret Learning in Online Convex Games

We consider an online stochastic game with risk-averse agents whose goal...
research
10/01/2018

Risk-Averse Stochastic Convex Bandit

Motivated by applications in clinical trials and finance, we study the p...
research
05/18/2021

CFR-MIX: Solving Imperfect Information Extensive-Form Games with Combinatorial Action Space

In many real-world scenarios, a team of agents coordinate with each othe...
research
09/15/2022

Risk-aware linear bandits with convex loss

In decision-making problems such as the multi-armed bandit, an agent lea...
research
10/03/2018

Bandit learning in concave N-person games

This paper examines the long-run behavior of learning with bandit feedba...
research
05/15/2022

Sample-Efficient Learning of Correlated Equilibria in Extensive-Form Games

Imperfect-Information Extensive-Form Games (IIEFGs) is a prevalent model...
research
11/10/2021

Multi-Agent Learning for Iterative Dominance Elimination: Formal Barriers and New Algorithms

Dominated actions are natural (and perhaps the simplest possible) multi-...
