RGB-T Semantic Segmentation with Location, Activation, and Sharpening

by Gongyang Li, et al.

Semantic segmentation is important for scene understanding. To handle scenes in which adverse illumination degrades natural (RGB) images, thermal infrared (TIR) images are introduced. Most existing RGB-T semantic segmentation methods follow one of three cross-modal fusion paradigms, i.e., encoder fusion, decoder fusion, and feature fusion. Some methods, unfortunately, ignore the properties of RGB and TIR features or the properties of features at different levels. In this paper, we propose a novel feature fusion-based network for RGB-T semantic segmentation, named LASNet, which follows three steps: location, activation, and sharpening. The highlight of LASNet is that we fully consider the characteristics of cross-modal features at different levels, and accordingly propose three specific modules for better segmentation. Concretely, we propose a Collaborative Location Module (CLM) for high-level semantic features, aiming to locate all potential objects; a Complementary Activation Module (CAM) for middle-level features, aiming to activate the exact regions of different objects; and an Edge Sharpening Module (ESM) for low-level texture features, aiming to sharpen the edges of objects. Furthermore, in the training phase, we attach a location supervision after CLM and an edge supervision after ESM, and impose two semantic supervisions in the decoder part to facilitate network convergence. Experimental results on two public datasets demonstrate the superiority of our LASNet over relevant state-of-the-art methods. The code and results of our method are available at https://github.com/MathLee/LASNet.
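The training setup described above attaches four supervision signals: one location supervision, one edge supervision, and two semantic supervisions. A minimal sketch of how such terms might be combined into a single objective is given below. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss functions or their weights, so the binary cross-entropy for the location and edge maps, the cross-entropy for the semantic maps, the equal weights, and all function names here are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0)


def binary_cross_entropy(probs, labels):
    """Mean binary cross-entropy over pixels; probs[i] is P(label == 1)."""
    total = 0.0
    for p, y in zip(probs, labels):
        total -= y * math.log(p + EPS) + (1 - y) * math.log(1 - p + EPS)
    return total / len(labels)


def cross_entropy(probs, labels):
    """Mean multi-class cross-entropy; probs[i] lists class probabilities for pixel i."""
    total = 0.0
    for p, y in zip(probs, labels):
        total -= math.log(p[y] + EPS)
    return total / len(labels)


def total_loss(loc_pred, loc_gt, edge_pred, edge_gt, sem_preds, sem_gt,
               weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four supervision terms.

    Assumptions (not stated in the abstract): location and edge targets are
    binary maps, both semantic predictions share the same ground truth, and
    the terms are weighted equally by default.
    """
    w_loc, w_edge, w_sem1, w_sem2 = weights
    sem1, sem2 = sem_preds
    return (w_loc * binary_cross_entropy(loc_pred, loc_gt)
            + w_edge * binary_cross_entropy(edge_pred, edge_gt)
            + w_sem1 * cross_entropy(sem1, sem_gt)
            + w_sem2 * cross_entropy(sem2, sem_gt))
```

Each prediction is passed here as a flattened list of per-pixel probabilities. In a real training loop these terms would be computed on tensors by a deep learning framework, and the weights would be tuned or taken from the released code at the repository linked above.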




