Online Optimization Methods for the Quantification Problem

by Purushottam Kar, et al.

The estimation of class prevalence, i.e., the fraction of a population that belongs to a certain class, is a very useful tool in data analytics and learning, and finds applications in many domains such as sentiment analysis and epidemiology. For example, in sentiment analysis, the objective is often not to decide whether a specific text conveys a positive or a negative sentiment, but rather to estimate the overall distribution of positive and negative sentiments during an event window. A popular way of performing this task, often dubbed quantification, is to use supervised learning to train a prevalence estimator from labeled data. The literature uses several performance measures to evaluate such prevalence estimators. In this paper we propose the first online stochastic algorithms for directly optimizing these quantification-specific performance measures. We also provide algorithms that optimize hybrid performance measures that seek to balance quantification and classification performance. Our algorithms present a significant advancement in the theory of multivariate optimization and we show, by a rigorous theoretical analysis, that they exhibit optimal convergence. We also report extensive experiments on benchmark and real data sets which demonstrate that our methods significantly outperform existing optimization techniques used for these performance measures.
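To make the distinction between classification and quantification concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a binary task, estimates prevalence by simply counting predicted positives, and scores the estimate with KL divergence, a measure commonly used in the quantification literature; the abstract itself does not name the measures the paper optimizes. The function names `classify_and_count` and `kl_divergence` and the toy labels are illustrative assumptions.

```python
import math


def classify_and_count(predictions):
    """Estimate positive-class prevalence as the fraction of items predicted positive."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return sum(1 for p in predictions if p == 1) / len(predictions)


def kl_divergence(p_true, p_hat, eps=1e-12):
    """KL divergence between the binary distributions
    (p_true, 1 - p_true) and (p_hat, 1 - p_hat).

    eps guards against division by zero when the estimate is exactly 0 or 1.
    """
    total = 0.0
    for p, q in ((p_true, p_hat), (1.0 - p_true, 1.0 - p_hat)):
        if p > 0:
            total += p * math.log(p / max(q, eps))
    return total


# Toy data (hypothetical): 3 of 10 items are truly positive.
true_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# The classifier makes two mistakes: one false negative and one false positive.
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(true_labels, predicted)) / len(true_labels)
true_prevalence = sum(true_labels) / len(true_labels)
estimated_prevalence = classify_and_count(predicted)
quantification_error = kl_divergence(true_prevalence, estimated_prevalence)

print(f"accuracy: {accuracy:.2f}")
print(f"true prevalence: {true_prevalence:.2f}, estimated: {estimated_prevalence:.2f}")
print(f"KL divergence: {quantification_error:.4f}")
```

In this toy case the two classification errors cancel, so the prevalence estimate is exact even though individual predictions are imperfect. This is one reason measures designed for quantification can behave differently from classification accuracy.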


