What You Expect is NOT What You Get! Questioning Reconstruction/Classification Correlation of Stacked Convolutional Auto-Encoder Features
In this paper, we thoroughly investigate the quality of features produced by deep neural network architectures obtained by stacking and convolving Auto-Encoders. In particular, we are interested in the relation between their reconstruction score and their performance on document layout analysis. When using Auto-Encoders, one could intuitively assume that features which are good for reconstruction will also lead to high classification accuracy. However, we show that this is not always the case. We examine the reconstruction score, the training error, and the results obtained when the same features are used for both input reconstruction and a classification task. We show that the reconstruction score is not a good metric because it is biased by the quality of the decoder. Furthermore, experimental results suggest that there is no correlation between the reconstruction score and the quality of the features for a classification task, and that the network size and configuration alone do not allow assumptions about the magnitude of its training error. We therefore conclude that reconstruction score and training error should not be used jointly to evaluate the quality of the features produced by Stacked Convolutional Auto-Encoders for a classification task. Consequently, one should investigate the network's classification abilities directly and independently.
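The gap between reconstruction quality and classification quality can be illustrated with a small toy sketch. This is not the paper's experiment: the data is synthetic, the encoder is a single linear unit rather than a Stacked Convolutional Auto-Encoder, and all names (`make_data`, `reconstruction_mse`, `threshold_accuracy`) are hypothetical. It assumes 2-D inputs in which the high-variance axis is unrelated to the label while the low-variance axis carries it, and relies on the fact that the optimum of a one-unit linear auto-encoder with tied weights is the top principal direction.

```python
import numpy as np


def make_data(n=2000, seed=0):
    """Synthetic 2-D data (an assumption made for this illustration only)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    # Axis 0: large variance, independent of the label.
    x0 = rng.normal(0.0, 5.0, size=n)
    # Axis 1: small variance, but its sign encodes the label.
    x1 = (2 * y - 1) * 0.5 + rng.normal(0.0, 0.1, size=n)
    X = np.column_stack([x0, x1])
    return X - X.mean(axis=0), y


def reconstruction_mse(X, w):
    """Encode with one linear unit w, decode with w transposed (tied weights)."""
    w = w / np.linalg.norm(w)
    code = X @ w
    X_hat = np.outer(code, w)
    return float(np.mean((X - X_hat) ** 2))


def threshold_accuracy(X, y, w):
    """Classify by the sign of the 1-D feature, taking the better polarity."""
    pred = (X @ w > 0).astype(int)
    acc = float(np.mean(pred == y))
    return max(acc, 1.0 - acc)


X, y = make_data()

# Principal directions: vt[0] minimises reconstruction error among
# one-unit tied linear auto-encoders; vt[1] is the orthogonal direction.
_, _, vt = np.linalg.svd(X, full_matrices=False)
w_pca, w_disc = vt[0], vt[1]

mse_pca = reconstruction_mse(X, w_pca)
mse_disc = reconstruction_mse(X, w_disc)
acc_pca = threshold_accuracy(X, y, w_pca)
acc_disc = threshold_accuracy(X, y, w_disc)

print(f"best-reconstruction feature: mse={mse_pca:.3f}, accuracy={acc_pca:.3f}")
print(f"orthogonal feature:          mse={mse_disc:.3f}, accuracy={acc_disc:.3f}")
```

In this constructed setting the feature with the lower reconstruction error is the one that classifies close to chance, and the feature with the higher reconstruction error separates the classes almost perfectly. The sketch only shows that such a mismatch is possible; it says nothing about how often it occurs in the networks studied in the paper.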