Building Bridges: Viewing Active Learning from the Multi-Armed Bandit Lens

09/26/2013
by Ravi Ganti, et al.

In this paper we propose a multi-armed-bandit-inspired, pool-based active learning algorithm for binary classification. By carefully constructing an analogy between active learning and multi-armed bandits, we draw on ideas such as lower confidence bounds and self-concordant regularization from the bandit literature to design our algorithm. The algorithm is sequential: in each round it assigns a sampling distribution over the pool, samples one point from this distribution, and queries the oracle for that point's label. The design of the sampling distribution is likewise inspired by the analogy between active learning and multi-armed bandits. We show how to derive the lower confidence bounds the algorithm requires. Experimental comparisons to previously proposed active learning algorithms show superior performance on some standard UCI datasets.
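The round structure described above (score each pool point with a lower confidence bound, convert the scores into a sampling distribution, sample one point, and query its label) can be sketched minimally as follows. The specific LCB form and the softmax conversion below are illustrative assumptions, not the bounds or distribution the paper derives:

```python
import numpy as np

rng = np.random.default_rng(0)

def lcb_scores(means, counts, t, c=1.0):
    """Lower confidence bound on each pool point's estimated quantity.

    A generic Hoeffding-style bonus is used here for illustration; the
    paper derives its own bounds via self-concordant regularization.
    """
    bonus = c * np.sqrt(np.log(t + 1) / np.maximum(counts, 1))
    return means - bonus

def sample_point(scores, temperature=1.0):
    """Turn scores into a sampling distribution over the pool and draw one index.

    Low-LCB points receive higher probability (softmax over negated,
    max-shifted scores for numerical stability).
    """
    logits = -scores / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    idx = rng.choice(len(scores), p=p)
    return idx, p

# One round on a toy pool: score, build the distribution, sample, "query".
means = np.array([0.2, 0.5, 0.8])   # current estimates per pool point
counts = np.array([1, 2, 3])        # times each point has been queried
scores = lcb_scores(means, counts, t=10)
idx, p = sample_point(scores)
```

In a full loop one would update `means` and `counts` with the queried label and repeat; that bookkeeping is omitted here.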


Related research

03/07/2022 | PAC-Bayesian Lifelong Learning For Multi-Armed Bandits
We present a PAC-Bayesian analysis of lifelong learning. In the lifelong...

09/29/2018 | Dynamic Ensemble Active Learning: A Non-Stationary Bandit with Expert Advice
Active learning aims to reduce annotation cost by predicting which sampl...

01/25/2016 | A Robust UCB Scheme for Active Learning in Regression from Strategic Crowds
We study the problem of training an accurate linear regression model by ...

06/30/2023 | Thompson sampling for improved exploration in GFlowNets
Generative flow networks (GFlowNets) are amortized variational inference...

05/13/2021 | Improved Algorithms for Agnostic Pool-based Active Classification
We consider active learning for binary classification in the agnostic po...

05/23/2022 | Falsification of Multiple Requirements for Cyber-Physical Systems Using Online Generative Adversarial Networks and Multi-Armed Bandits
We consider the problem of falsifying safety requirements of Cyber-Physi...

07/06/2018 | Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences
The design of personalized incentives or recommendations to improve user...
