Resolution-Aware Design of Atrous Rates for Semantic Segmentation Networks

by Bum Jun Kim, et al.

DeepLab is a widely used deep neural network for semantic segmentation, whose success is attributed to its parallel architecture called atrous spatial pyramid pooling (ASPP). ASPP uses multiple atrous convolutions with different atrous rates to extract both local and global information. However, fixed values of atrous rates are used for the ASPP module, which restricts the size of its field of view. In principle, the atrous rate should be a hyperparameter that changes the field-of-view size according to the target task or dataset. However, no guidelines govern how the atrous rate should be set. This study proposes practical guidelines for obtaining an optimal atrous rate. First, an effective receptive field for semantic segmentation is introduced to analyze the inner behavior of segmentation networks. We observed that using the ASPP module yielded a specific pattern in the effective receptive field, which was traced to reveal the module's underlying mechanism. Accordingly, we derive practical guidelines for obtaining the optimal atrous rate, which should be controlled based on the size of the input image. Compared to other values, using the optimal atrous rate consistently improved the segmentation results across multiple datasets, including the STARE, CHASE_DB1, HRF, Cityscapes, and iSAID datasets.
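To make the link between atrous rate, field of view, and input size concrete, the following is a minimal sketch and not the guideline derived in the paper. It relies only on the standard identity that a k x k kernel with atrous rate r spans k + (k - 1)(r - 1) cells per side. The function names, the rates (6, 12, 18), and the output stride of 16 are illustrative assumptions that do not come from the abstract.

```python
def atrous_extent(kernel_size: int, rate: int) -> int:
    """Side length, in feature-map cells, spanned by one atrous kernel."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def aspp_summary(input_size: int, rates=(6, 12, 18),
                 kernel_size: int = 3, output_stride: int = 16):
    """Compare each branch's kernel extent with the feature-map size.

    Assumes a square input and a backbone that downsamples by
    `output_stride` (an illustrative setting, not taken from the paper).
    """
    feature_size = input_size // output_stride
    rows = []
    for rate in rates:
        extent = atrous_extent(kernel_size, rate)
        # If the extent exceeds the feature map, the outer taps of a
        # centered kernel land in the zero padding.
        rows.append((rate, extent, extent <= feature_size))
    return feature_size, rows


if __name__ == "__main__":
    for size in (512, 1024):
        feature_size, rows = aspp_summary(size)
        print(f"input {size}px -> feature map {feature_size} cells per side")
        for rate, extent, fits in rows:
            print(f"  rate {rate:2d}: extent {extent:2d} cells, fits: {fits}")
```

Under these assumptions, the same rate covers a different fraction of the feature map as the input size changes, which is one way to see why a fixed set of rates may not suit every dataset.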



Learning Dilation Factors for Semantic Segmentation of Street Scenes

Contextual information is crucial for semantic segmentation. However, fi...

Dilated Point Convolutions: On the Receptive Field of Point Convolutions

In this work, we propose Dilated Point Convolutions (DPC) which drastica...

Adaptive Context Encoding Module for Semantic Segmentation

The object sizes in images are diverse, therefore, capturing multiple sc...

MRNet: Multiple-Input Receptive Field Network for Large-Scale Point Cloud Segmentation

The size of the input receptive field is one of the most critical aspect...

Autofocus Layer for Semantic Segmentation

We propose the autofocus convolutional layer for semantic segmentation w...

Dilated Convolutions with Lateral Inhibitions for Semantic Image Segmentation

Dilated convolutions are widely used in deep semantic segmentation model...

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

Currently, the neural network architecture design is mostly guided by th...
