Bayesian active learning for production, a systematic study and a reusable library

by Parmida Atighehchian, et al.

Active learning reduces labelling effort by using a machine learning model to query the annotator for the most informative inputs. While there are many papers on new active learning techniques, these techniques rarely satisfy the constraints of a real-world project. In this paper, we analyse the main drawbacks of current active learning techniques and present approaches to alleviate them. We conduct a systematic study of how the most common issues of real-world datasets affect the deep active learning process: model convergence, annotation error, and dataset imbalance. We derive two techniques that speed up the active learning loop: partial uncertainty sampling and a larger query size. Finally, we present our open-source Bayesian active learning library, BaaL.
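The Bayesian active learning loop described above scores unlabelled inputs by model uncertainty and queries the highest-scoring ones for annotation. The sketch below is not BaaL's actual API; it is a minimal NumPy illustration, under the assumption that uncertainty is estimated with BALD (mutual information between predictions and model parameters) computed from several MC-dropout forward passes. The names `bald_scores` and `query` are hypothetical.

```python
import numpy as np

def bald_scores(probs):
    """BALD acquisition scores from MC-dropout samples.

    probs: array of shape (n_points, n_mc, n_classes) holding softmax
    outputs from n_mc stochastic forward passes per unlabelled point.
    Returns one score per point: H[mean prediction] - mean H[prediction],
    an estimate of the mutual information between label and parameters.
    """
    eps = 1e-12  # avoid log(0)
    mean_p = probs.mean(axis=1)                                  # (n, c)
    entropy_of_mean = -(mean_p * np.log(mean_p + eps)).sum(axis=1)
    mean_entropy = -(probs * np.log(probs + eps)).sum(axis=2).mean(axis=1)
    return entropy_of_mean - mean_entropy

def query(probs, k):
    """Indices of the k most informative unlabelled points."""
    return np.argsort(-bald_scores(probs))[:k]
```

A point where the MC samples disagree (the model is uncertain about its parameters) receives a high BALD score, while a point where every pass agrees, even on a flat distribution, scores near zero. The paper's partial uncertainty sampling amounts to running `query` on a random subset of the pool rather than the whole pool, which shortens each loop iteration.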


