Edge-aware Guidance Fusion Network for RGB Thermal Scene Parsing

by Wujie Zhou, et al.

RGB thermal scene parsing has recently attracted increasing research interest in computer vision. However, most existing methods extract boundaries poorly in their prediction maps and do not fully exploit high-level features. In addition, these methods fuse the features from the RGB and thermal modalities too simply to obtain comprehensive fused features. To address these problems, we propose an edge-aware guidance fusion network (EGFNet) for RGB thermal scene parsing. First, we introduce a prior edge map, generated from the RGB and thermal images, to capture detailed information in the prediction map, and we embed this prior edge information in the feature maps. To fuse the RGB and thermal information effectively, we propose a multimodal fusion module designed to ensure adequate cross-modal fusion. Considering the importance of high-level semantic information, we propose a global information module and a semantic information module to extract rich semantic information from the high-level features. For decoding, we use simple elementwise addition for cascaded feature fusion. Finally, to improve the parsing accuracy, we apply multitask deep supervision to the semantic and boundary maps. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed EGFNet and its superior performance compared with state-of-the-art methods. The code and results can be found at https://github.com/ShaohuaDong2021/EGFNet.
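
The decoding step named in the abstract, cascaded feature fusion by elementwise addition, can be illustrated with a toy sketch. This is not EGFNet's implementation, which is in the linked repository. The function names (`upsample2x`, `add_maps`, `cascaded_decode`), the single-channel maps, the factor-2 pyramid, and nearest-neighbour upsampling are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (rows x cols)


def upsample2x(x: Grid) -> Grid:
    """Nearest-neighbour 2x upsampling (assumed; the abstract names no method)."""
    out: Grid = []
    for row in x:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def add_maps(a: Grid, b: Grid) -> Grid:
    """Elementwise addition of two equally sized feature maps."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[va + vb for va, vb in zip(ra, rb)] for ra, rb in zip(a, b)]


def cascaded_decode(features: List[Grid]) -> Grid:
    """Fuse features ordered from shallow (largest) to deep (smallest).

    Starting from the deepest map, repeatedly upsample the running result
    and add the next shallower map to it.
    """
    fused = features[-1]
    for skip in reversed(features[:-1]):
        fused = add_maps(upsample2x(fused), skip)
    return fused


# Hypothetical three-level pyramid: 4x4, 2x2, and 1x1 maps.
shallow = [[1.0] * 4 for _ in range(4)]
middle = [[10.0] * 2 for _ in range(2)]
deep = [[100.0]]

decoded = cascaded_decode([shallow, middle, deep])
print(decoded[0])  # first row of the fused 4x4 map
```

Because the fusion is plain addition, the decoder itself adds no learnable parameters at the merge points. Any learned processing would have to sit in the modules that produce the features being added.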


RGB-T Semantic Segmentation with Location, Activation, and Sharpening

Semantic segmentation is important for scene understanding. To address t...

Correlating Edge, Pose with Parsing

According to existing studies, human body edge and pose are two benefici...

Multi-Modal Hybrid Learning and Sequential Training for RGB-T Saliency Detection

RGB-T saliency detection has emerged as an important computer vision tas...

Trimap-guided Feature Mining and Fusion Network for Natural Image Matting

Utilizing trimap guidance and fusing multi-level features are two import...

Flare-Aware Cross-modal Enhancement Network for Multi-spectral Vehicle Re-identification

Multi-spectral vehicle re-identification aims to address the challenge o...

SpiderMesh: Spatial-aware Demand-guided Recursive Meshing for RGB-T Semantic Segmentation

For semantic segmentation in urban scene understanding, RGB cameras alon...

Semantic Flow for Fast and Accurate Scene Parsing

In this paper, we focus on effective methods for fast and accurate scene...
