Effectiveness of Scaled Exponentially-Regularized Linear Units (SERLUs)

07/26/2018
by G. Zhang et al.

Recently, self-normalizing neural networks (SNNs) have been proposed with the intention of avoiding batch or weight normalization. The key step in SNNs is to properly scale the exponential linear unit (referred to as SELU) so that normalization is inherently incorporated, based on central limit theory. SELU is a monotonically increasing function with an approximately constant negative output for large negative inputs. In this work, we propose a new activation function that breaks the monotonicity of SELU while still preserving the self-normalizing property. Unlike SELU, the new function introduces a bump-shaped response in the negative-input region by regularizing a linear function with a scaled exponential function; it is referred to as a scaled exponentially-regularized linear unit (SERLU). The bump-shaped function has an approximately zero response to large negative inputs while still pushing the output of SERLU towards zero mean statistically. To effectively combat over-fitting, we develop a so-called shift-dropout for SERLU, which includes standard dropout as a special case. Experimental results on MNIST, CIFAR10 and CIFAR100 show that SERLU-based neural networks consistently deliver promising results in comparison with five other activation functions: ELU, SELU, Swish, Leaky ReLU and ReLU.
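
The abstract describes the SERLU shape only qualitatively, so the following is a minimal NumPy sketch of an activation with that shape: a scaled linear response for non-negative inputs and a bump-shaped term of the form x * exp(x) for negative inputs. The piecewise form follows the abstract's description, but the constants LAMBDA_SERLU and ALPHA_SERLU below are illustrative assumptions, not the values derived in the paper to obtain the self-normalizing fixed point.

```python
import numpy as np

# Illustrative constants only; the paper derives specific values that
# preserve the self-normalizing (zero-mean, unit-variance) property.
LAMBDA_SERLU = 1.0786  # assumed overall scale parameter
ALPHA_SERLU = 2.9087   # assumed scale of the negative-input bump

def serlu(x):
    """Sketch of a scaled exponentially-regularized linear unit (SERLU).

    Non-negative inputs are scaled linearly; negative inputs pass through
    a bump-shaped term x * exp(x), which dips below zero near the origin
    and decays back towards zero for large negative inputs.
    """
    x = np.asarray(x, dtype=np.float64)
    return np.where(
        x >= 0.0,
        LAMBDA_SERLU * x,
        LAMBDA_SERLU * ALPHA_SERLU * x * np.exp(x),
    )

if __name__ == "__main__":
    xs = np.array([-6.0, -2.0, -0.5, 0.0, 0.5, 2.0])
    print(serlu(xs))
```

The negative-input bump is what distinguishes this shape from SELU: instead of saturating at a constant negative value, the response returns towards zero for large negative inputs while its dip near the origin statistically pushes layer outputs towards zero mean.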
