Multimodal Fake News Detection via CLIP-Guided Learning

by Yangming Zhou, et al.

Multimodal fake news detection (FND) has attracted considerable research interest in social forensics. Many existing approaches introduce tailored attention mechanisms to guide the fusion of unimodal features. However, how the similarity of these features is calculated and how it affects the decision-making process in FND remain open questions. In addition, the potential of pretrained multimodal feature learning models in fake news detection has not been well exploited. This paper proposes FND-CLIP, a multimodal Fake News Detection network based on Contrastive Language-Image Pretraining (CLIP). Given a targeted multimodal news item, we extract deep representations from the image and text using a ResNet-based encoder, a BERT-based encoder, and two pair-wise CLIP encoders. The multimodal feature is a concatenation of the CLIP-generated features, weighted by the standardized cross-modal similarity of the two modalities. The extracted features are further processed for redundancy reduction before being fed into the final classifier. We introduce a modality-wise attention module to adaptively reweight and aggregate the features. We have conducted extensive experiments on typical fake news datasets. The results indicate that the proposed framework is better at mining crucial features for fake news detection. FND-CLIP outperforms previous works, with improvements of 0.7%, 6.8% and 1.3% in overall accuracy on Weibo, Politifact and Gossipcop, respectively. We also show that CLIP-based learning allows greater flexibility in multimodal feature selection.
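The similarity-weighted fusion and modality-wise reweighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the CLIP text and image features are already available as plain lists of floats, that "standardized" means a z-score over the batch, and that a sigmoid maps the standardized similarity to a weight in (0, 1). The function names (fuse_clip_features, aggregate_modalities) and the externally supplied attention scores are hypothetical; in a trained model the scores would be learned.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def standardize(values, eps=1e-8):
    # Z-score over the batch (assumed meaning of "standardized").
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    return [(x - mean) / math.sqrt(var + eps) for x in values]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fuse_clip_features(text_feats, image_feats):
    # One cross-modal similarity per news item in the batch.
    sims = [cosine(t, i) for t, i in zip(text_feats, image_feats)]
    # Assumption: squash the standardized similarity into (0, 1).
    weights = [sigmoid(z) for z in standardize(sims)]
    # Concatenate text and image features, then scale by the weight.
    fused = [
        [w * x for x in list(t) + list(i)]
        for w, t, i in zip(weights, text_feats, image_feats)
    ]
    return fused, weights

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_modalities(features, scores):
    # Reweight same-sized per-modality features by softmax attention
    # and sum them. The scores are hypothetical inputs here.
    attn = softmax(scores)
    dim = len(features[0])
    return [sum(a * f[k] for a, f in zip(attn, features)) for k in range(dim)]

# Toy batch: matching, unrelated, and opposing text/image pairs.
text_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
fused, weights = fuse_clip_features(text_feats, image_feats)
```

In this toy batch the pair with the highest text-image agreement receives the largest weight, so its CLIP features contribute most to the fused representation, while the weight for a mismatched pair shrinks toward zero.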




Multimodal Fake News Detection with Adaptive Unimodal Representation Aggregation

The development of Internet technology has continuously intensified the ...

Cross-modal Contrastive Learning for Multimodal Fake News Detection

Automatic detection of multimodal fake news has gained a widespread atte...

Similarity-Aware Multimodal Prompt Learning for Fake News Detection

The standard paradigm for fake news detection mainly utilizes text infor...

Multimodal Short Video Rumor Detection System Based on Contrastive Learning

With short video platforms becoming one of the important channels for ne...

FakeSV: A Multimodal Benchmark with Rich Social Context for Fake News Detection on Short Video Platforms

Short video platforms have become an important channel for news sharing,...

Detecting Out-of-Context Multimodal Misinformation with interpretable neural-symbolic model

Recent years have witnessed the sustained evolution of misinformation th...

Multimodal Matching-aware Co-attention Networks with Mutual Knowledge Distillation for Fake News Detection

Fake news often involves multimedia information such as text and image t...
