Wizard of Search Engine: Access to Information Through Conversations with Search Engines

by Pengjie Ren, et al.

Conversational information seeking (CIS) is playing an increasingly important role in connecting people to information. Due to the lack of suitable resources, previous studies on CIS have been limited to theoretical/conceptual frameworks, laboratory-based user studies, or a particular aspect of CIS (e.g., asking clarifying questions). In this work, we facilitate research on CIS in three ways. (1) We formulate a pipeline for CIS with six sub-tasks: intent detection (ID), keyphrase extraction (KE), action prediction (AP), query selection (QS), passage selection (PS), and response generation (RG). (2) We release a benchmark dataset, called Wizard of Search Engine (WISE), which allows for comprehensive and in-depth research on all aspects of CIS. (3) We design a neural architecture that can be trained and evaluated on the six sub-tasks both jointly and separately, and devise a pre-train/fine-tune learning scheme that reduces the scale required of WISE by making full use of available data. We report some useful characteristics of CIS based on statistics of WISE. We also show that our best-performing model variant is able to achieve effective CIS, as indicated by several metrics. We release the dataset, the code, and the evaluation scripts to facilitate future research by measuring further improvements in this important research direction.
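
To make the six-sub-task pipeline concrete, here is a minimal sketch of how one conversational turn might flow through the stages in the order the abstract lists them. Only the six sub-task names come from the abstract. The `TurnState` container, the `run_pipeline` and `toy_stages` functions, the label strings, and the rule-based stand-ins for each stage are all hypothetical placeholders for illustration. They are not the neural architecture described in the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Sub-task names as listed in the abstract; the ordering follows that list.
STAGES = ["ID", "KE", "AP", "QS", "PS", "RG"]


@dataclass
class TurnState:
    """Hypothetical container for one user turn and per-stage outputs."""
    utterance: str
    outputs: Dict[str, object] = field(default_factory=dict)


def run_pipeline(utterance: str,
                 stages: Dict[str, Callable[[TurnState], object]]) -> TurnState:
    """Run each stage in order; later stages can read earlier outputs."""
    state = TurnState(utterance)
    for name in STAGES:
        state.outputs[name] = stages[name](state)
    return state


def toy_stages() -> Dict[str, Callable[[TurnState], object]]:
    """Rule-based placeholders (assumptions), one per sub-task."""
    return {
        # Intent detection: a fixed, made-up label.
        "ID": lambda s: "reveal",
        # Keyphrase extraction: keep words longer than three characters.
        "KE": lambda s: [w for w in s.utterance.split() if len(w) > 3],
        # Action prediction: a fixed, made-up system action.
        "AP": lambda s: "answer",
        # Query selection: join the extracted keyphrases into one query.
        "QS": lambda s: " ".join(s.outputs["KE"]),
        # Passage selection: no retrieval backend here, so return nothing.
        "PS": lambda s: [],
        # Response generation: echo the selected query.
        "RG": lambda s: "Results for: " + s.outputs["QS"],
    }


if __name__ == "__main__":
    state = run_pipeline("tell me about conversational search", toy_stages())
    for name in STAGES:
        print(name, "->", state.outputs[name])
```

Because every stage is an entry in a dictionary, a single stage can be swapped out or evaluated in isolation, which loosely mirrors the idea of working on the sub-tasks both jointly and separately.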




