Online Preselection with Context Information under the Plackett-Luce Model

02/11/2020
by Adil El Mesaoudi-Paul, et al.

We consider an extension of the contextual multi-armed bandit problem in which, instead of selecting a single alternative (arm), a learner is supposed to make a preselection in the form of a subset of alternatives. More specifically, in each iteration, the learner is presented with a set of arms and a context, both described in terms of feature vectors. The task of the learner is to preselect k of these arms, among which a final choice is made in a second step. In our setup, we assume that each arm has a latent (context-dependent) utility, and that feedback on a preselection is produced according to a Plackett-Luce model. We propose the CPPL algorithm, which is inspired by the well-known UCB algorithm, and evaluate this algorithm on synthetic and real data. In particular, we consider an online algorithm selection scenario, which served as a main motivation for our problem setting. Here, an instance (which defines the context) from a certain problem class (such as SAT) can be solved by different algorithms (the arms), but only k of these algorithms can actually be run.
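To make the feedback model concrete, here is a minimal sketch of one round of preselection under a Plackett-Luce choice model. It is not the CPPL algorithm itself. It assumes a log-linear utility exp(<theta, x>) over a joint arm-context feature vector, a known weight vector `theta`, and plain top-k selection without any UCB-style exploration bonus. All function names and numeric values are hypothetical and chosen for illustration only.

```python
import math
import random
from typing import List, Sequence


def utilities(theta: Sequence[float], arms: Sequence[Sequence[float]]) -> List[float]:
    """Latent utility of each arm, assuming the log-linear form exp(<theta, x>)."""
    return [math.exp(sum(t * x for t, x in zip(theta, arm))) for arm in arms]


def preselect(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring arms (no exploration bonus in this sketch)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def pl_choice_probs(utils: Sequence[float], subset: Sequence[int]) -> List[float]:
    """Plackett-Luce probability that each arm in the subset is the final choice:
    its utility divided by the total utility of the preselected arms."""
    total = sum(utils[i] for i in subset)
    return [utils[i] / total for i in subset]


def sample_winner(utils: Sequence[float], subset: Sequence[int], rng: random.Random) -> int:
    """Draw the final choice among the preselected arms."""
    probs = pl_choice_probs(utils, subset)
    return rng.choices(list(subset), weights=probs, k=1)[0]


# Hypothetical data: four arms, each described by a 2-dimensional joint feature vector.
theta = [1.0, -0.5]
arms = [[0.2, 0.1], [0.9, 0.3], [0.4, 0.8], [0.7, 0.0]]

utils = utilities(theta, arms)
subset = preselect(utils, k=2)
probs = pl_choice_probs(utils, subset)
winner = sample_winner(utils, subset, random.Random(0))
```

In an online setting the learner would not know `theta`; it would score arms with an estimate updated from the observed winners, and a UCB-inspired method would add a confidence term to those scores before the top-k step.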

research
10/23/2020

Online Algorithm for Unsupervised Sequential Selection with Contextual Information

In this paper, we study Contextual Unsupervised Sequential Selection (US...
research
07/14/2020

Generic Outlier Detection in Multi-Armed Bandit

In this paper, we study the problem of outlier arm detection in multi-ar...
research
08/31/2021

Max-Utility Based Arm Selection Strategy For Sequential Query Recommendations

We consider the query recommendation problem in closed loop interactive ...
research
09/30/2021

Adapting Bandit Algorithms for Settings with Sequentially Available Arms

Although the classical version of the Multi-Armed Bandits (MAB) framewor...
research
10/01/2020

Unknown Delay for Adversarial Bandit Setting with Multiple Play

This paper addresses the problem of unknown delays in adversarial multi-...
research
04/25/2019

Learning to Detect an Odd Markov Arm

A multi-armed bandit with finitely many arms is studied when each arm is...
research
01/31/2015

Sparse Dueling Bandits

The dueling bandit problem is a variation of the classical multi-armed b...