Agnostic Multi-Group Active Learning

by Nick Rittler, et al.

Inspired by the problem of improving classification accuracy on rare or hard subsets of a population, there has been recent interest in models of learning where the goal is to generalize to a collection of distributions, each representing a “group”. We consider a variant of this problem from the perspective of active learning, where the learner is endowed with the power to decide which examples are labeled from each distribution in the collection, and the goal is to minimize the number of label queries while maintaining PAC-learning guarantees. Our main challenge is that standard active learning techniques such as disagreement-based active learning do not directly apply to the multi-group learning objective. We modify existing algorithms to provide a consistent active learning algorithm for an agnostic formulation of multi-group learning, which, given a collection of G distributions and a hypothesis class ℋ with VC-dimension d, outputs an ϵ-optimal hypothesis using Õ( (ν^2/ϵ^2 + 1) G d θ_𝒢^2 log^2(1/ϵ) + G log(1/ϵ)/ϵ^2 ) label queries, where θ_𝒢 is the worst-case disagreement coefficient over the collection. Roughly speaking, this guarantee improves upon the label complexity of standard multi-group learning in regimes where disagreement-based active learning algorithms may be expected to succeed and the number of groups is not too large. We also consider the special case where each distribution in the collection is individually realizable with respect to ℋ, and demonstrate that Õ( G d θ_𝒢 log(1/ϵ) ) label queries are sufficient for learning in this case. We further give an approximation result for the full agnostic case inspired by the group realizable strategy.
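
To get a feel for how the two label-complexity expressions above scale, the following minimal sketch evaluates the quantities inside the Õ(·) for concrete parameter values. It is an illustration only: the function names are hypothetical, the hidden constants and polylogarithmic factors are set to 1, and the natural logarithm is assumed, so the outputs are useful for comparing regimes rather than as actual query counts.

```python
import math


def agnostic_label_bound(nu, eps, G, d, theta):
    """Expression inside the O-tilde for the agnostic case.

    Assumption: hidden constants and polylog factors are taken to be 1,
    and log is the natural logarithm.
    """
    log_term = math.log(1.0 / eps)
    # (nu^2/eps^2 + 1) * G * d * theta^2 * log^2(1/eps)
    disagreement_part = (nu**2 / eps**2 + 1.0) * G * d * theta**2 * log_term**2
    # G * log(1/eps) / eps^2
    per_group_part = G * log_term / eps**2
    return disagreement_part + per_group_part


def group_realizable_label_bound(eps, G, d, theta):
    """Expression inside the O-tilde for the group-realizable case
    (same assumptions as above)."""
    return G * d * theta * math.log(1.0 / eps)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    params = dict(eps=0.05, G=4, d=10, theta=2.0)
    for nu in (0.0, 0.05, 0.2):
        print(f"nu={nu}: agnostic ~ {agnostic_label_bound(nu, **params):.0f}")
    print(f"group realizable ~ {group_realizable_label_bound(**params):.0f}")
```

Under these assumptions, the first term of the agnostic expression grows with (ν/ϵ)^2, while the second term, G log(1/ϵ)/ϵ^2, is present regardless of ν.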



Beyond Disagreement-based Agnostic Active Learning

We study agnostic active learning, where the goal is to learn a classifi...

Learning Time Dependent Choice

We explore questions dealing with the learnability of models of choice o...

Online Active Learning: Label Complexity vs. Classification Errors

We study online active learning for classifying streaming instances. At ...

Active Learning of Classifiers with Label and Seed Queries

We study exact active learning of binary and multiclass classifiers with...

Target-Independent Active Learning via Distribution-Splitting

To reduce the label complexity in Agnostic Active Learning (A^2 algorith...

A note on active learning for smooth problems

We show that the disagreement coefficient of certain smooth hypothesis c...

Exponential Savings in Agnostic Active Learning through Abstention

We show that in pool-based active classification without assumptions on ...
