Optimization on Product Submanifolds of Convolution Kernels

01/22/2017
by M. Ozay, et al.

Recent optimization methods for training convolutional neural networks (CNNs) whose kernels are normalized according to particular constraints have shown remarkable success. This work introduces an approach for training CNNs on ensembles of joint spaces of kernels constructed using different constraints. To this end, we address the problem of optimization on ensembles of products of submanifolds (PEMs) of convolution kernels. We first propose three strategies for constructing ensembles of PEMs in CNNs, and we then expound their geometric properties (their metric and curvature) in CNNs. Building on these theoretical results, we develop a geometry-aware stochastic gradient descent algorithm (G-SGD) for optimization on ensembles of PEMs to train CNNs, and we analyze its convergence properties in light of the geometry of PEMs. In the experimental analyses, we employ G-SGD to train CNNs on the CIFAR-10, CIFAR-100 and ImageNet datasets. The results show that the geometry-adaptive step-size computation of G-SGD can improve the training loss and convergence properties of CNNs. Moreover, we observe that the classification performance of baseline CNNs can be boosted by applying G-SGD to ensembles of PEMs identified by multiple constraints.
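To make the idea concrete, below is a minimal sketch (not the authors' implementation) of one geometry-aware SGD update for a single kernel constrained to the unit sphere, a common normalization constraint of the kind the abstract describes; the full G-SGD method operates on ensembles of product submanifolds with adaptive step sizes, which this sketch omits. The function name gsgd_step_sphere and the NumPy setup are illustrative assumptions.

```python
import numpy as np

def gsgd_step_sphere(kernel, euclid_grad, lr):
    """Hypothetical geometry-aware SGD step for a kernel constrained
    to the unit sphere:
      1. project the Euclidean gradient onto the tangent space of the
         sphere at the current kernel,
      2. take a gradient step along the tangent direction,
      3. retract back onto the sphere by renormalizing.
    """
    w = kernel.ravel()
    g = euclid_grad.ravel()
    # Tangent-space projection: remove the radial component of the gradient.
    g_tan = g - np.dot(g, w) * w
    # Gradient step followed by a retraction (renormalization).
    w_new = w - lr * g_tan
    w_new /= np.linalg.norm(w_new)
    return w_new.reshape(kernel.shape)

# Usage: a 3x3 kernel with unit Frobenius norm stays on the sphere.
rng = np.random.default_rng(0)
k = rng.standard_normal((3, 3))
k /= np.linalg.norm(k)
grad = rng.standard_normal((3, 3))   # stand-in for a backprop gradient
k = gsgd_step_sphere(k, grad, lr=0.1)
print(np.linalg.norm(k))             # ~1.0 after the update
```

The projection-then-retraction pattern is the standard way to keep iterates on a constraint submanifold; a product of such submanifolds is handled by applying the same step to each factor (e.g., each kernel) independently.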

