Data Selection Strategies for Multi-Domain Sentiment Analysis

02/08/2017
by Sebastian Ruder, et al.

Domain adaptation is important in sentiment analysis because sentiment-indicating words vary between domains. Multi-domain adaptation has recently become more common, but existing approaches train on all available source domains, including dissimilar ones. The selection of appropriate training data, however, is as important as the choice of algorithm. To our knowledge, we undertake the first extensive study of domain similarity metrics in the context of sentiment analysis, and we propose novel representations, metrics, and a new scope for data selection. We evaluate the proposed methods in two large-scale multi-domain adaptation settings, on tweets and on reviews, and show that they consistently outperform strong random and balanced baselines; our proposed selection strategy also outperforms instance-level selection and yields the best score on a large reviews corpus.
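To make the idea of similarity-based data selection concrete, here is a minimal sketch that ranks source domains by how close their word distributions are to a target domain. The abstract does not say which metrics or representations the paper uses, so the choice of Jensen-Shannon divergence over smoothed unigram distributions is an assumption made purely for illustration. The function names (`term_distribution`, `js_divergence`, `rank_source_domains`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def term_distribution(docs, vocab):
    """Add-one-smoothed unigram distribution of `docs` over a fixed vocabulary."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    total = sum(counts[t] for t in vocab)
    return [(counts[t] + 1) / (total + len(vocab)) for t in vocab]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def rank_source_domains(sources, target_docs):
    """Return source-domain names ordered from most to least similar to the target.

    `sources` maps a domain name to a list of raw text documents
    (an illustrative data layout, not one taken from the paper).
    """
    all_docs = [doc for docs in sources.values() for doc in docs] + list(target_docs)
    vocab = sorted({tok for doc in all_docs for tok in doc.lower().split()})
    target = term_distribution(target_docs, vocab)
    scores = {
        name: js_divergence(term_distribution(docs, vocab), target)
        for name, docs in sources.items()
    }
    return sorted(scores, key=scores.get)


# Toy example: pick the source domain closest to an unlabeled target sample.
sources = {
    "books": ["the plot was gripping", "a dull boring read"],
    "kitchen": ["the blender broke quickly", "sharp sturdy knife"],
}
target_docs = ["the novel plot was boring"]
ranking = rank_source_domains(sources, target_docs)
```

A ranking like this could be used to train only on the most similar source domains instead of on all of them, which is the kind of selection the abstract argues for. The actual representations, metrics, and selection scope studied in the paper may differ.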


research
06/12/2018

Projecting Embeddings for Domain Adaptation: Joint Modeling of Sentiment Analysis in Diverse Domains

Domain adaptation for sentiment analysis is challenging due to the fact ...
research
07/29/2019

Learning Invariant Representations for Sentiment Analysis: The Missing Material is Datasets

Learning representations which remain invariant to a nuisance factor has...
research
04/25/2018

Strong Baselines for Neural Semi-supervised Learning under Domain Shift

Novel neural models have been proposed in recent years for learning unde...
research
04/05/2019

Distinguishing Clinical Sentiment: The Importance of Domain Adaptation in Psychiatric Patient Health Records

Recently natural language processing (NLP) tools have been developed to ...
research
03/20/2022

From Stance to Concern: Adaptation of Propositional Analysis to New Tasks and Domains

We present a generalized paradigm for adaptation of propositional analys...
research
05/08/2022

Multi-Domain Targeted Sentiment Analysis

Targeted Sentiment Analysis (TSA) is a central task for generating insig...
research
07/17/2017

Learning to select data for transfer learning with Bayesian Optimization

Domain similarity measures can be used to gauge adaptability and select ...
