research ∙ 03/29/2020
A Decentralized Policy with Logarithmic Regret for a Class of Multi-Agent Multi-Armed Bandit Problems with Option Unavailability Constraints and Stochastic Communication Protoc
This paper considers a multi-armed bandit (MAB) problem in which multipl...
          
research ∙ 10/07/2019
     
             
  
  
     