research
∙
03/29/2020
A Decentralized Policy with Logarithmic Regret for a Class of Multi-Agent Multi-Armed Bandit Problems with Option Unavailability Constraints and Stochastic Communication Protoc
This paper considers a multi-armed bandit (MAB) problem in which multipl...
research
∙
10/07/2019
An Option and Agent Selection Policy with Logarithmic Regret for Multi Agent Multi Armed Bandit Problems on Random Graphs
Existing studies of the Multi Agent Multi Armed Bandit (MAMAB) problem, ...
research
∙
10/05/2017