Perspectives on Large Language Models for Relevance Judgment

04/13/2023
by Guglielmo Faggioli, et al.

When asked, current large language models (LLMs) like ChatGPT claim that they can assist us with relevance judgments. Many researchers think this would not lead to credible IR research. In this perspective paper, we discuss possible ways for LLMs to assist human experts, along with the concerns and issues that arise. We devise a human-machine collaboration spectrum that allows categorizing different relevance judgment strategies based on how much the human relies on the machine. For the extreme point of "fully automated assessment", we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing two opposing perspectives - for and against the use of LLMs for automatic relevance judgments - and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers. We hope to start a constructive discussion within the community to avoid a stalemate during review, where work is damned if it uses LLMs for evaluation and damned if it doesn't.


