research ∙ 03/29/2020
A Decentralized Policy with Logarithmic Regret for a Class of Multi-Agent Multi-Armed Bandit Problems with Option Unavailability Constraints and Stochastic Communication Protoc
This paper considers a multi-armed bandit (MAB) problem in which multipl...
          
research ∙ 10/07/2019
An Option and Agent Selection Policy with Logarithmic Regret for Multi Agent Multi Armed Bandit Problems on Random Graphs
Existing studies of the Multi Agent Multi Armed Bandit (MAMAB) problem, ...
          
research ∙ 10/05/2017
     
             
  
  
     