Sparse then Prune: Toward Efficient Vision Transformers

07/22/2023
by   Yogi Prasetyo, et al.

The Vision Transformer architecture is a deep learning model inspired by the success of the Transformer model in Natural Language Processing. However, the self-attention mechanism, the large number of parameters, and the requirement for a substantial amount of training data still make Vision Transformers computationally burdensome. In this research, we investigate the possibility of applying Sparse Regularization to Vision Transformers, and the impact of Pruning, either after Sparse Regularization or without it, on the trade-off between performance and efficiency. To accomplish this, we apply Sparse Regularization and Pruning methods to the Vision Transformer architecture for image classification tasks on the CIFAR-10, CIFAR-100, and ImageNet-100 datasets. The training process for the Vision Transformer model consists of two parts: pre-training and fine-tuning. Pre-training utilizes ImageNet21K data, followed by fine-tuning for 20 epochs. The results show that when testing with CIFAR-100 and ImageNet-100 data, models with Sparse Regularization can increase accuracy by 0.12%. Applying Pruning after Sparse Regularization yields even better results: it increases the average accuracy by 0.568% on ImageNet-100 data compared to pruning models without Sparse Regularization. Code can be accessed here: https://github.com/yogiprsty/Sparse-ViT
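The two ingredients described in the abstract, a sparsity-inducing regularizer added to the training loss followed by magnitude pruning of the smallest weights, can be sketched in a few lines. This is a minimal NumPy illustration of the general technique, not the authors' implementation; the regularization strength `lam` and the 50% sparsity target are assumed values for the example.

```python
import numpy as np

def l1_penalty(weights, lam):
    # Sparse (L1) regularization term, added to the training loss so that
    # optimization drives many weights toward zero.
    return lam * sum(np.abs(w).sum() for w in weights)

def magnitude_prune(weights, sparsity):
    # Global unstructured magnitude pruning: zero out the fraction
    # `sparsity` of entries with the smallest absolute value.
    flat = np.concatenate([w.ravel() for w in weights])
    k = int(sparsity * flat.size)
    if k == 0:
        return [w.copy() for w in weights]
    threshold = np.partition(np.abs(flat), k)[k]
    return [np.where(np.abs(w) < threshold, 0.0, w) for w in weights]

# Toy "model": two parameter tensors standing in for ViT weight matrices.
rng = np.random.default_rng(0)
W = [rng.normal(size=(4, 4)), rng.normal(size=(4,))]

penalty = l1_penalty(W, lam=1e-4)      # would be added to the task loss
pruned = magnitude_prune(W, sparsity=0.5)
zeros = int(sum((w == 0).sum() for w in pruned))
total = sum(w.size for w in pruned)
```

After pruning, half of the 20 toy weights are exactly zero; in the paper's setting, the same idea is applied to the Vision Transformer's parameters after fine-tuning with the sparse regularizer.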


Related research

- Vision Transformers in 2022: An Update on Tiny ImageNet (05/21/2022)
- Three things everyone should know about Vision Transformers (03/18/2022)
- Fine-tuning Vision Transformers for the Prediction of State Variables in Ising Models (09/28/2021)
- Chasing Sparsity in Vision Transformers: An End-to-End Exploration (06/08/2021)
- EVA-02: A Visual Representation for Neon Genesis (03/20/2023)
- NViT: Vision Transformer Compression and Parameter Redistribution (10/10/2021)
- Vision Transformers For Weeds and Crops Classification Of High Resolution UAV Images (09/06/2021)
