See Better Before Looking Closer: Weakly Supervised Data Augmentation Network for Fine-Grained Visual Classification
Data augmentation is commonly used to increase the amount of training data, prevent overfitting, and improve the performance of deep models. In practice, however, the benefit of standard data augmentation, such as random image cropping, is limited because it can introduce a large amount of uncontrolled background noise. In this paper, we propose the Weakly Supervised Data Augmentation Network (WS-DAN) to explore the potential of data augmentation. Specifically, for each training image, we first generate attention maps that represent the object's discriminative parts through weakly supervised learning. Next, we randomly choose one attention map and use it to augment the image in two ways: attention crop and attention drop. WS-DAN improves classification accuracy in two respects. First, images can be seen better, since multiple object parts can be activated. Second, attention regions provide spatial information about the object, which allows the model to look closer at the image and further improve performance. Comprehensive experiments on common fine-grained visual classification datasets show that our method surpasses state-of-the-art methods by a large margin, demonstrating its effectiveness.
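The attention crop and attention drop steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the attention maps have already been produced by some model and resized to the image's height and width. The function names, the min-max normalization, and the fixed threshold of 0.5 are hypothetical choices made for illustration only.

```python
import numpy as np


def _normalize(attention):
    """Min-max scale an attention map to [0, 1] (an assumed normalization)."""
    lo, hi = attention.min(), attention.max()
    return (attention - lo) / (hi - lo + 1e-8)


def attention_crop(image, attention, threshold=0.5):
    """Crop the image to the bounding box of the high-attention region."""
    mask = _normalize(attention) > threshold
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    if rows.size == 0:
        # No pixel exceeds the threshold: fall back to the full image.
        return image.copy()
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1].copy()


def attention_drop(image, attention, threshold=0.5):
    """Zero out the high-attention region so other parts must be used."""
    mask = _normalize(attention) > threshold
    dropped = image.copy()
    dropped[mask] = 0  # an (H, W) mask applies to every channel of (H, W, C)
    return dropped


def augment(image, attention_maps, rng):
    """Pick one attention map at random and return (cropped, dropped)."""
    k = rng.integers(len(attention_maps))
    chosen = attention_maps[k]
    return attention_crop(image, chosen), attention_drop(image, chosen)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8, 3))
    attention = np.zeros((8, 8))
    attention[2:5, 3:6] = 1.0  # a toy "discriminative part"
    cropped, dropped = augment(image, [attention], rng)
    print(cropped.shape, dropped.shape)
```

In this sketch the cropped image would typically be resized back to the network's input size before being fed to the model, so that the attended part is seen at higher resolution, while the dropped image hides that part and encourages the model to rely on other regions.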