Robust Information Retrieval for False Claims with Distracting Entities In Fact Extraction and Verification

by Mingwen Dong, et al.

Accurate evidence retrieval is essential for automated fact checking. Little previous research has examined how true and false claims differ and how those differences affect evidence retrieval. This paper shows that, compared with true claims, false claims more frequently contain irrelevant entities, which can distract the evidence retrieval model. A BERT-based retrieval model made more mistakes when retrieving refuting evidence for false claims than when retrieving supporting evidence for true claims. When tested on synthetically generated adversarial false claims containing irrelevant entities, the retrieval model's recall was significantly lower than on the original claims. These results suggest that the vanilla BERT-based retrieval model is not robust to irrelevant entities in false claims. Augmenting the training data with synthetic false claims containing irrelevant entities yielded a model with higher evidence recall, including on false claims with irrelevant entities. In addition, using separate models to retrieve refuting and supporting evidence and then aggregating their results also increased evidence recall, including on false claims with irrelevant entities. Together, these results suggest that data augmentation and model ensembling can make the BERT-based retrieval model more robust to false claims with irrelevant entities.
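The aggregation idea can be illustrated with a minimal sketch. The abstract does not say how the two retrievers' outputs are combined, so everything below is an assumption made for illustration: the function names `aggregate_evidence` and `evidence_recall`, the max-score merge rule, the sentence ids, and the toy scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def aggregate_evidence(
    supporting: Dict[str, float],
    refuting: Dict[str, float],
    k: int = 5,
) -> List[str]:
    """Merge two retrievers' sentence scores and return the top-k sentence ids.

    Assumption: each sentence keeps the higher of its two scores.
    The paper may aggregate differently.
    """
    merged = dict(supporting)
    for sent_id, score in refuting.items():
        merged[sent_id] = max(score, merged.get(sent_id, float("-inf")))
    # Sort by descending score; break ties by id so the order is deterministic.
    ranked = sorted(merged, key=lambda s: (-merged[s], s))
    return ranked[:k]


def evidence_recall(retrieved: List[str], gold: Set[str]) -> float:
    """Fraction of gold evidence sentences that appear in the retrieved list."""
    if not gold:
        return 0.0
    return len(gold & set(retrieved)) / len(gold)


# Toy scores (hypothetical): the refuting-evidence retriever ranks s4 highly,
# while the supporting-evidence retriever never scores it at all.
supporting_scores = {"s1": 0.9, "s2": 0.2, "s3": 0.1}
refuting_scores = {"s2": 0.8, "s4": 0.7}
gold_evidence = {"s2", "s4"}

single = aggregate_evidence(supporting_scores, {}, k=3)
combined = aggregate_evidence(supporting_scores, refuting_scores, k=3)
print(evidence_recall(single, gold_evidence))    # 0.5
print(evidence_recall(combined, gold_evidence))  # 1.0
```

In this toy case the supporting-evidence retriever alone misses one gold sentence, and merging in the refuting-evidence retriever's scores recovers it. A max-score merge is only one possible choice; score averaging or rank interleaving would fit the same description.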


