Adversarial Domain Adaptation for Duplicate Question Detection

09/07/2018
by Darsh J Shah, et al.

We address the problem of detecting duplicate questions in forums, which is an important step towards automating the process of answering new questions. As finding and annotating such potential duplicates manually is very tedious and costly, automatic methods based on machine learning are a viable alternative. However, many forums do not have annotated data, i.e., questions labeled by experts as duplicates, and thus a promising solution is to use domain adaptation from another forum that has such annotations. Here we focus on adversarial domain adaptation, deriving important findings about when it performs well and what properties of the domains are important in this regard. Our experiments with StackExchange data show an average improvement of 5.6% over the best baseline across multiple pairs of domains.
