DoCoGen: Domain Counterfactual Generation for Low Resource Domain Adaptation

02/24/2022
by Nitay Calderon, et al.

Natural language processing (NLP) algorithms have become very successful, but they still struggle when applied to out-of-distribution examples. In this paper we propose a controllable generation approach to address this domain adaptation (DA) challenge. Given an input text example, our DoCoGen algorithm generates a domain-counterfactual textual example (D-con): a text that is similar to the original in all aspects, including the task label, but whose domain is changed to a desired one. Importantly, DoCoGen is trained using only unlabeled examples from multiple domains; it requires neither NLP task labels nor parallel pairs of textual examples and their domain-counterfactuals. We show that DoCoGen can generate coherent counterfactuals consisting of multiple sentences. We use the D-cons generated by DoCoGen to augment a sentiment classifier and a multi-label intent classifier in 20 and 78 DA setups, respectively, where source-domain labeled data is scarce. Our model outperforms strong baselines and improves the accuracy of a state-of-the-art unsupervised DA algorithm.
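
The augmentation step described in the abstract can be illustrated with a minimal sketch. It assumes only what the abstract states: a D-con keeps the task label of its source example while its domain changes. Everything else here is hypothetical and introduced for illustration: the function names `augment_with_dcons` and `toy_generator`, the tiny word-swap lexicon, and the choice to skip D-cons identical to their source text. The toy generator is a stand-in so the example runs; it is not DoCoGen and does not reflect how DoCoGen produces text.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (text, task_label)


def toy_generator(text: str, target_domain: str) -> str:
    """Hypothetical stand-in for a trained D-con generator (NOT DoCoGen).

    It swaps a few hand-picked domain words so the sketch is runnable.
    """
    lexicon = {"kitchen": {"book": "blender", "read": "use"}}
    swaps = lexicon.get(target_domain, {})
    return " ".join(swaps.get(token, token) for token in text.split())


def augment_with_dcons(
    labeled: List[Example],
    target_domains: List[str],
    generate: Callable[[str, str], str],
) -> List[Example]:
    """Return the labeled source examples plus one D-con per target domain.

    Each D-con inherits the task label of the example it was generated from,
    since a D-con is meant to change the domain but not the label.
    """
    augmented = list(labeled)
    for text, label in labeled:
        for domain in target_domains:
            dcon = generate(text, domain)
            # Assumption of this sketch: drop outputs identical to the input.
            if dcon != text:
                augmented.append((dcon, label))
    return augmented


source = [("great book , easy to read", "positive")]
augmented = augment_with_dcons(source, ["kitchen"], toy_generator)
for text, label in augmented:
    print(label, "|", text)
```

In a low-resource setup, the augmented list would then replace the original labeled set when training the task classifier; any real generator with the same `(text, target_domain) -> text` interface could be passed in place of the toy one.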

research
02/24/2021

PADA: A Prompt-based Autoregressive Approach for Adaptation to Unseen Domains

Natural Language Processing algorithms have made incredible progress rec...
research
11/07/2020

Interventional Domain Adaptation

Domain adaptation (DA) aims to transfer discriminative features learned ...
research
06/01/2022

IDANI: Inference-time Domain Adaptation via Neuron-level Interventions

Large pre-trained models are usually fine-tuned on downstream task data,...
research
03/27/2022

Example-based Hypernetworks for Out-of-Distribution Generalization

While Natural Language Processing (NLP) algorithms keep reaching unprece...
research
05/04/2023

ReMask: A Robust Information-Masking Approach for Domain Counterfactual Generation

Domain shift is a big challenge in NLP, thus, many approaches resort to ...
research
12/30/2022

TA-DA: Topic-Aware Domain Adaptation for Scientific Keyphrase Identification and Classification (Student Abstract)

Keyphrase identification and classification is a Natural Language Proces...
research
09/22/2020

My Health Sensor, my Classifier: Adapting a Trained Classifier to Unlabeled End-User Data

In this work, we present an approach for unsupervised domain adaptation ...
