In Situ Answer Sentence Selection at Web-scale
Current answer sentence selection (AS2) applied in open-domain question answering (ODQA) selects answers by ranking a large set of possible candidates, i.e., sentences, extracted from the retrieved text. In this paper, we present Passage-based Extracting Answer Sentence In-place (PEASI), a novel design for AS2 optimized for the Web-scale setting that instead computes such an answer without processing each candidate individually. Specifically, we design a Transformer-based framework that jointly (i) reranks passages retrieved for a question and (ii) identifies a probable answer from the top passages in place. We train PEASI in a multi-task learning framework that encourages feature sharing between its two components: the passage reranker and the passage-based answer sentence extractor. To facilitate our development, we construct a new Web-sourced large-scale QA dataset consisting of 800,000+ labeled passages/sentences for 60,000+ questions. The experiments show that our proposed design effectively outperforms the current state-of-the-art setting for AS2, i.e., a point-wise model for ranking sentences independently, by 6.51% in accuracy, from 48.86%. PEASI is also efficient in computing answer sentences, requiring only 20% of what the standard setting, i.e., reranking all possible candidates, requires. We believe the release of PEASI, both the dataset and our proposed design, can contribute to advancing the research and development in deploying question answering services at Web scale.
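To make the contrast between the two settings concrete, the following is a minimal sketch, not the PEASI model itself. It assumes a toy token-overlap scorer in place of the Transformer, and the names `Passage`, `overlap`, `pointwise_as2`, and `passage_based_as2` are hypothetical. It also assumes that the in-place extractor reads a whole passage in one call, which is why each top passage is counted as a single call.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Passage:
    sentences: List[str]


def overlap(question: str, text: str) -> float:
    """Toy relevance score (assumption): share of question tokens found in text."""
    q = set(re.findall(r"\w+", question.lower()))
    t = set(re.findall(r"\w+", text.lower()))
    return len(q & t) / len(q) if q else 0.0


def pointwise_as2(question: str, passages: List[Passage]) -> Tuple[str, int]:
    """Standard setting: score every candidate sentence independently."""
    calls = 0
    best, best_score = "", -1.0
    for passage in passages:
        for sentence in passage.sentences:
            calls += 1  # one scorer call per candidate sentence
            score = overlap(question, sentence)
            if score > best_score:
                best, best_score = sentence, score
    return best, calls


def passage_based_as2(
    question: str, passages: List[Passage], top_k: int = 1
) -> Tuple[str, int]:
    """Passage-based setting: rerank passages, then pick a sentence in place."""
    calls = 0
    scored = []
    for passage in passages:
        calls += 1  # one reranker call per passage
        scored.append((overlap(question, " ".join(passage.sentences)), passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)

    best, best_score = "", -1.0
    for _, passage in scored[:top_k]:
        calls += 1  # assumed: one extractor call reads the whole passage
        for sentence in passage.sentences:
            score = overlap(question, sentence)
            if score > best_score:
                best, best_score = sentence, score
    return best, calls


if __name__ == "__main__":
    docs = [
        Passage(["Hamlet is a tragedy.", "William Shakespeare wrote Hamlet around 1600."]),
        Passage(["Macbeth is set in Scotland.", "It was first performed in 1606."]),
        Passage(["Paris is the capital of France.", "The Seine runs through it."]),
    ]
    print(pointwise_as2("Who wrote Hamlet?", docs))
    print(passage_based_as2("Who wrote Hamlet?", docs, top_k=1))
```

In this toy setup the call count of the passage-based variant grows with the number of passages plus `top_k`, whereas the point-wise variant grows with the total number of sentences. The counts illustrate the general idea only and say nothing about the figures reported above.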