Cross-modal Contrastive Learning for Multimodal Fake News Detection

by Longzheng Wang, et al.

Automatic detection of multimodal fake news has recently gained widespread attention. Many existing approaches fuse unimodal features to produce multimodal news representations. However, the potential of powerful cross-modal contrastive learning methods for fake news detection has not been well exploited. Moreover, how to aggregate features from different modalities to improve the decision-making process remains an open question. To address these issues, we propose COOLANT, a cross-modal contrastive learning framework for multimodal fake news detection that aims to achieve more accurate image-text alignment. To further improve alignment precision, we leverage an auxiliary task to soften the loss term of negative samples during the contrast process. A cross-modal fusion module is developed to learn the cross-modality correlations. An attention mechanism with an attention guidance module is implemented to aggregate the aligned unimodal representations and the cross-modality correlations effectively and interpretably. Finally, we evaluate COOLANT and conduct a comparative study on two widely used datasets, Twitter and Weibo. The experimental results demonstrate that COOLANT outperforms previous approaches by a large margin and achieves new state-of-the-art results on both datasets.
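The abstract does not give the loss itself, so the following is only a minimal sketch of the general idea of a cross-modal contrastive objective whose negative-sample targets are softened. It is not COOLANT's actual formulation. The function name `soft_contrastive_loss`, the parameters `temperature` and `alpha`, and the choice to derive soft targets from intra-modal similarity are all assumptions made for illustration; the paper's auxiliary task may work quite differently.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(row):
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def soft_contrastive_loss(img, txt, temperature=0.07, alpha=0.2):
    """Symmetric image-text contrastive loss with softened targets.

    img, txt: lists of equal length; img[i] and txt[i] form a matched pair.
    alpha: weight of the soft targets (alpha=0 gives one-hot targets).
    ASSUMPTION: soft targets come from averaged intra-modal similarity,
    which is an illustrative choice and not taken from the paper.
    """
    img = [l2_normalize(v) for v in img]
    txt = [l2_normalize(v) for v in txt]
    n = len(img)

    # Soft target distribution for each pair i over all candidates j.
    soft = [
        softmax([(dot(img[i], img[j]) + dot(txt[i], txt[j])) / (2 * temperature)
                 for j in range(n)])
        for i in range(n)
    ]
    # Blend one-hot targets with the soft targets, so that negatives that
    # resemble the positive are penalised less harshly.
    targets = [
        [(1 - alpha) * (1.0 if i == j else 0.0) + alpha * soft[i][j]
         for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy(queries, keys):
        total = 0.0
        for i in range(n):
            logp = log_softmax([dot(queries[i], k) / temperature for k in keys])
            total -= sum(t * lp for t, lp in zip(targets[i], logp))
        return total / n

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(img, txt) + cross_entropy(txt, img))
```

With matched pairs the loss should be small, and it should grow when the text embeddings are shuffled relative to the images. Setting `alpha=0` reduces the sketch to a standard symmetric InfoNCE-style loss with one-hot targets.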




Multimodal Fake News Detection via CLIP-Guided Learning

Multimodal fake news detection has attracted many research interests in ...

Multimodal Fake News Detection with Adaptive Unimodal Representation Aggregation

The development of Internet technology has continuously intensified the ...

Multimodal Analytics for Real-world News using Measures of Cross-modal Entity Consistency

The World Wide Web has become a popular source for gathering information...

Detecting Out-of-Context Multimodal Misinformation with interpretable neural-symbolic model

Recent years have witnessed the sustained evolution of misinformation th...

Co-Attentive Cross-Modal Deep Learning for Medical Evidence Synthesis and Decision Making

Modern medicine requires generalised approaches to the synthesis and int...

Multimodal Matching-aware Co-attention Networks with Mutual Knowledge Distillation for Fake News Detection

Fake news often involves multimedia information such as text and image t...

CAMANet: Class Activation Map Guided Attention Network for Radiology Report Generation

Radiology report generation (RRG) has gained increasing research attenti...
