Non-linear operations such as GELU, layer normalization, and softmax are...
Although deep convolutional networks have achieved improved performance ...
Back-propagation has been the workhorse of recent successes of deep lear...
The ICML 2013 Workshop on Challenges in Representation Learning focused ...