To Phrase or Not to Phrase - Impact of User versus System Term Dependence Upon Retrieval

02/07/2018
by Christina Lioma, et al.

When submitting queries to information retrieval (IR) systems, users often have the option of specifying which, if any, of the query terms are heavily dependent on each other and should be treated as a fixed phrase, for instance by placing them between quotes. In addition to such cases where users specify term dependence, automatic ways also exist for IR systems to detect dependent terms in queries. Most IR systems use both user and algorithmic approaches. It is, however, not clear whether and to what extent user-defined term dependence agrees with algorithmic estimates of term dependence, nor which of the two may fetch higher performance gains. Simply put, is it better to trust users or the system to detect term dependence in queries? To answer this question, we experiment with 101 crowdsourced search engine users and 334 queries (52 train and 282 test TREC queries) and we record 10 assessments per query. We find that (i) user assessments of term dependence differ significantly from algorithmic assessments of term dependence (their overlap is approximately 30%); (ii) there is little agreement among users about term dependence in queries, and this disagreement increases as queries become longer; (iii) the potential retrieval gain that can be fetched by treating term dependence (both user- and system-defined) over a bag of words baseline is reserved to a small subset (approximately 8% ... precision measures. Points (ii) and (iii) constitute novel insights into term dependence.
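The comparison between user-defined and system-defined term dependence can be illustrated with a small sketch. It reads the phrases a user placed between double quotes and measures their overlap with a set of system-detected phrases. Everything here is an assumption made for illustration: the function names `user_phrases` and `jaccard`, the example query, the `detected` set standing in for the output of some automatic phrase detector, and the choice of Jaccard overlap, which the abstract does not name as the measure used.

```python
from __future__ import annotations

import re


def user_phrases(query: str) -> set[tuple[str, ...]]:
    """Return multi-term phrases the user marked with double quotes."""
    return {
        tuple(match.lower().split())
        for match in re.findall(r'"([^"]+)"', query)
        if len(match.split()) > 1  # a single quoted term is not a phrase
    }


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; two empty sets count as full agreement."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical query with two user-marked phrases.
query = '"information retrieval" systems for "query term" dependence'
marked = user_phrases(query)

# Hypothetical output of an automatic phrase detector (not from the paper).
detected = {("information", "retrieval"), ("term", "dependence")}

overlap = jaccard(marked, detected)
print(f"user phrases: {sorted(marked)}")
print(f"overlap with system phrases: {overlap:.2f}")
```

Representing each phrase as a tuple of lowercase terms keeps word order, so a marked phrase only matches a detected phrase with the same terms in the same sequence.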

research
02/20/2023

Query Performance Prediction for Neural IR: Are We There Yet?

Evaluation in Information Retrieval relies on post-hoc empirical procedu...
research
09/07/2018

Term-Mouse-Fixations as an Additional Indicator for Topical User Interests in Domain-Specific Search

Models in Interactive Information Retrieval (IIR) are grounded very much...
research
11/25/2021

Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators

Heavily pre-trained transformers for language modelling, such as BERT, h...
research
05/31/2022

Interactive Query Clarification and Refinement via User Simulation

When users initiate search sessions, their queries are often unclear or ...
research
05/29/2023

Adapting Learned Sparse Retrieval for Long Documents

Learned sparse retrieval (LSR) is a family of neural retrieval methods t...
research
01/19/2022

Validating Simulations of User Query Variants

System-oriented IR evaluations are limited to rather abstract understand...
