Siamese Neural Networks with Random Forest for detecting duplicate question pairs
Determining whether two given questions are semantically similar is a fairly chal- lenging task given the different struc- tures and forms that the questions can take. In this paper, we use Gated Re- current Units(GRU) in combination with other highly used machine learning algo- rithms like Random Forest, Adaboost and SVM for the similarity prediction task on a dataset released by Quora, consisting of about 400k labeled question pairs. We got the best result by using the Siamese adaptation of a Bidirectional GRU with a Random Forest classifier, which landed us among the top 24 Quora Question Pairs hosted on Kaggle.
READ FULL TEXT