A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers

05/07/2021
by Pradeep Dasigi, et al.

Readers of academic research papers often read with the goal of answering specific questions. Question Answering systems that can answer those questions can make consumption of the content much more efficient. However, building such tools requires data that reflect the difficulty of the task arising from complex reasoning about claims made in multiple parts of a paper. In contrast, existing information-seeking question answering datasets usually contain questions about generic factoid-type information. We therefore present QASPER, a dataset of 5,049 questions over 1,585 Natural Language Processing papers. Each question is written by an NLP practitioner who read only the title and abstract of the corresponding paper, and the question seeks information present in the full text. The questions are then answered by a separate set of NLP practitioners who also provide supporting evidence to answers. We find that existing models that do well on other QA tasks do not perform well on answering these questions, underperforming humans by at least 27 F1 points when answering them from entire papers, motivating further research in document-grounded, information-seeking QA, which our dataset is designed to facilitate.

research
11/30/2022

CREPE: Open-Domain Question Answering with False Presuppositions

Information seeking users often pose questions with false presupposition...
research
04/04/2020

Talk to Papers: Bringing Neural Question Answering to Academic Search

We introduce Talk to Papers, which exploits the recent open-domain quest...
research
10/22/2020

Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval

Recent progress in pretrained language model "solved" many reading compr...
research
12/16/2021

QuALITY: Question Answering with Long Input Texts, Yes!

To enable building and testing models on long-document comprehension, we...
research
02/28/2022

Paper Plain: Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing

When seeking information not covered in patient-friendly documents, like...
research
01/10/2018

MilkQA: a Dataset of Consumer Questions for the Task of Answer Selection

We introduce MilkQA, a question answering dataset from the dairy domain ...
research
10/19/2020

Question Generation for Supporting Informational Query Intents

Users frequently ask simple factoid questions when encountering question...
