Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents

02/18/2020
by Sashank Santhanam, et al.

Humans frequently interact with conversational agents. Rapid advances in generative language modeling with neural networks have helped advance the creation of intelligent conversational agents. Researchers typically evaluate the output of their models through crowdsourced judgments, but there are no established best practices for conducting such studies. Moreover, it is unclear whether cognitive biases in decision-making affect crowdsourced workers' judgments when they undertake these tasks. To investigate, we conducted a between-subjects study with 77 crowdsourced workers to understand the role of cognitive biases, specifically anchoring bias, when humans are asked to evaluate the output of conversational agents. Our results provide insight into how best to evaluate conversational agents. We find that increased consistency in ratings across two experimental conditions may be a result of anchoring bias. We also find that external factors such as time and prior experience in similar tasks affect inter-rater consistency.


