Orthogonal SVD Covariance Conditioning and Latent Disentanglement

12/11/2022
by Yue Song, et al.

Inserting an SVD meta-layer into neural networks can make the covariance matrix ill-conditioned, which harms training stability and generalization. In this paper, we systematically study how to improve covariance conditioning by enforcing orthogonality on the Pre-SVD layer. We first investigate existing orthogonal treatments of the weights; these techniques improve the conditioning but hurt performance. To avoid this side effect, we propose the Nearest Orthogonal Gradient (NOG) and the Optimal Learning Rate (OLR). We validate the effectiveness of our methods in two applications: decorrelated Batch Normalization (BN) and Global Covariance Pooling (GCP). Extensive experiments on visual recognition demonstrate that our methods can simultaneously improve covariance conditioning and generalization, and that combining them with orthogonal weights further boosts performance. Moreover, a series of experiments on various benchmarks shows that our orthogonality techniques help generative models achieve better latent disentanglement. Code is available at: https://github.com/KingJamesSong/OrthoImproveCond.
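To make the two central ideas concrete, below is a minimal PyTorch sketch, not the authors' released implementation (that lives in the repository linked above): (a) one way to measure the conditioning of the covariance seen by an SVD meta-layer, and (b) the Nearest Orthogonal Gradient idea of replacing a weight gradient with its closest orthogonal matrix, i.e. its polar factor. The function names, the (batch, features) layout, and the 1e-12 floor are illustrative assumptions.

    import torch

    def covariance_condition_number(x: torch.Tensor) -> torch.Tensor:
        # x: (batch, features). Build the sample covariance and compare its
        # largest and smallest eigenvalues; a large ratio means the SVD
        # meta-layer sees an ill-conditioned input.
        xc = x - x.mean(dim=0, keepdim=True)
        cov = xc.T @ xc / (x.shape[0] - 1)
        eigvals = torch.linalg.eigvalsh(cov)  # real, ascending for symmetric cov
        return eigvals[-1] / eigvals[0].clamp_min(1e-12)  # floor is an assumption

    def nearest_orthogonal(grad: torch.Tensor) -> torch.Tensor:
        # For grad = U S V^T, the closest orthogonal matrix in Frobenius norm
        # is the polar factor U V^T (all singular values set to 1).
        u, _, vh = torch.linalg.svd(grad, full_matrices=False)
        return u @ vh

    # Hypothetical usage in a training step, after loss.backward():
    # layer.weight.grad = nearest_orthogonal(layer.weight.grad)

Replacing the gradient with its polar factor keeps its singular subspaces while equalizing its singular values, which is what makes the update orthogonal without changing the directions it acts along.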


Related research:
- Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality (07/05/2022)
- Fast Differentiable Matrix Square Root and Inverse Square Root (01/29/2022)
- On the Eigenvalues of Global Covariance Pooling for Fine-grained Visual Recognition (05/26/2022)
- MimicNorm: Weight Mean and Last BN Layer Mimic the Dynamic of Batch Normalization (10/19/2020)
- Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? (05/06/2021)
- Delving into Variance Transmission and Normalization: Shift of Average Gradient Makes the Network Collapse (03/22/2021)
- A New High Performance and Scalable SVD algorithm on Distributed Memory Systems (06/16/2018)
