Match^2: A Matching over Matching Model for Similar Question Identification

by   Zizhen Wang, et al.

Community Question Answering (CQA) has become a primary means for people to acquire knowledge, where people are free to ask questions or submit answers. To enhance the efficiency of the service, similar question identification becomes a core task in CQA which aims to find a similar question from the archived repository whenever a new question is asked. However, it has long been a challenge to properly measure the similarity between two questions due to the inherent variation of natural language, i.e., there could be different ways to ask a same question or different questions sharing similar expressions. To alleviate this problem, it is natural to involve the existing answers for the enrichment of the archived questions. Traditional methods typically take a one-side usage, which leverages the answer as some expanded representation of the corresponding question. Unfortunately, this may introduce unexpected noises into the similarity computation since answers are often long and diverse, leading to inferior performance. In this work, we propose a two-side usage, which leverages the answer as a bridge of the two questions. The key idea is based on our observation that similar questions could be addressed by similar parts of the answer while different questions may not. In other words, we can compare the matching patterns of the two questions over the same answer to measure their similarity. In this way, we propose a novel matching over matching model, namely Match^2, which compares the matching patterns between two question-answer pairs for similar question identification. Empirical experiments on two benchmark datasets demonstrate that our model can significantly outperform previous state-of-the-art methods on the similar question identification task.


Effective Transfer Learning for Identifying Similar Questions: Matching User Questions to COVID-19 FAQs

People increasingly search online for answers to their medical questions...

Answer Identification in Collaborative Organizational Group Chat

We present a simple unsupervised approach for answer identification in o...

GSSF: A Generative Sequence Similarity Function based on a Seq2Seq model for clustering online handwritten mathematical answers

Toward a computer-assisted marking for descriptive math questions,this p...

Group Sparse CNNs for Question Classification with Answer Sets

Question classification is an important task with wide applications. How...

Adversarial Training for Community Question Answer Selection Based on Multi-scale Matching

Community-based question answering (CQA) websites represent an important...

Want a Good Answer? Ask a Good Question First!

Community Question Answering (CQA) websites have become valuable reposit...

Domain-Relevant Embeddings for Medical Question Similarity

The rate at which medical questions are asked online significantly excee...

Please sign up or login with your details

Forgot password? Click here to reset