Statistically Motivated Second Order Pooling
Second-order pooling, also known as bilinear pooling, has proven effective for visual recognition. Recent progress in this area has focused either on designing normalization techniques for second-order models or on compressing second-order representations. However, these two directions have typically been pursued separately, and without any clear statistical motivation. Here, by contrast, we introduce a statistically motivated framework that jointly tackles normalization and compression of second-order representations. To this end, we design a parametric vectorization layer, which maps a covariance matrix, known to follow a Wishart distribution, to a vector whose elements can be shown to follow a Chi-square distribution. We then propose a square-root normalization, which makes the distribution of the resulting representation converge to a Gaussian, thus complying with the Gaussianity assumption that many standard machine learning techniques rely on. As evidenced by our experiments, this lets us outperform state-of-the-art second-order models on several benchmark recognition datasets.
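The pipeline described in the abstract (covariance, parametric vectorization, square-root normalization) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the vectorization takes the quadratic form z_j = w_j^T Σ w_j, which for a Wishart-distributed Σ is a scaled Chi-square variable. The function name `sqrt_normalized_pooling`, the projection matrix `weights`, and the feature shapes are hypothetical.

```python
import numpy as np


def sqrt_normalized_pooling(features, weights):
    """Sketch: covariance -> quadratic-form vectorization -> square root.

    features: (n, d) array of n local descriptors with d channels.
    weights:  (d, p) array; each column w_j yields one output element.
              (Hypothetical parameterization, assumed for illustration.)
    """
    # Sample covariance of the local descriptors, shape (d, d).
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (features.shape[0] - 1)

    # Quadratic form w_j^T cov w_j for every column j, shape (p,).
    quad = np.einsum("dp,de,ep->p", weights, cov, weights)

    # Square-root normalization; clip guards against tiny negative
    # values caused by floating-point round-off.
    return np.sqrt(np.clip(quad, 0.0, None))


rng = np.random.default_rng(0)
features = rng.normal(size=(196, 64))  # e.g. a 14x14 feature map with 64 channels
weights = rng.normal(size=(64, 16))    # random here; would be learned in a network
pooled = sqrt_normalized_pooling(features, weights)
print(pooled.shape)
```

In this sketch the compression comes from choosing the number of columns p to be much smaller than d(d+1)/2, the number of distinct entries of the covariance matrix, while the square root is the step that moves each Chi-square-like element toward a Gaussian shape.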