NViT: Vision Transformer Compression and Parameter Redistribution

10/10/2021
by   Huanrui Yang, et al.

Transformers yield state-of-the-art results across many tasks. However, they still impose huge computational costs during inference. We apply global, structural pruning with latency-aware regularization on all parameters of the Vision Transformer (ViT) model for latency reduction. Furthermore, we analyze the pruned architectures and find interesting regularities in the final weight structure. These insights lead to a new architecture, NViT (Novel ViT), which redistributes where parameters are used. This architecture utilizes parameters more efficiently and enables control of the latency-accuracy trade-off. On ImageNet-1K, we prune the DEIT-Base (Touvron et al., 2021) model to a 2.6x FLOPs reduction, 5.1x parameter reduction, and 1.9x run-time speedup with only a 0.07% accuracy gain when compressing the base model to the throughput of the Small/Tiny variants. NViT gains 0.1-1.1% accuracy over the DeiT family when trained from scratch, while being faster.
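The abstract does not spell out the pruning criterion, so the following is only a minimal sketch of what latency-aware structural pruning scores for a ViT block could look like. It assumes a first-order Taylor importance estimate, uses PyTorch's stock nn.TransformerEncoderLayer as a stand-in for a DeiT block, and plugs in made-up per-group latency costs (lat_head, lat_mlp) and penalty strength (reg); it is not the authors' implementation.

```python
# Hypothetical sketch (not the paper's released code): latency-aware structural
# pruning scores for a toy ViT encoder layer. Importance of each prunable group
# (attention head / MLP channel) is estimated with a first-order Taylor term
# |w * dL/dw| and discounted by an assumed per-group latency cost, so pruning
# prefers groups that are both unimportant and expensive at run time.
import torch
import torch.nn as nn

torch.manual_seed(0)

dim, heads, mlp_ratio, tokens = 192, 3, 4, 197  # DeiT-Tiny-like sizes
layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                   dim_feedforward=dim * mlp_ratio,
                                   batch_first=True)

# One dummy forward/backward pass to populate gradients.
x = torch.randn(2, tokens, dim)
loss = layer(x).pow(2).mean()
loss.backward()

def taylor_score(weight: torch.Tensor, dims) -> torch.Tensor:
    """First-order Taylor importance |w * dL/dw|, summed over the given dims."""
    return (weight * weight.grad).abs().sum(dim=dims)

head_dim = dim // heads
# Per-head importance from the packed QKV projection of self-attention.
qkv = layer.self_attn.in_proj_weight              # shape: (3*dim, dim)
per_row = taylor_score(qkv, dims=(1,))            # one score per output row
head_scores = per_row.view(3, heads, head_dim).sum(dim=(0, 2))

# Per-channel importance of the first MLP layer (rows of linear1).
mlp_scores = taylor_score(layer.linear1.weight, dims=(1,))

# Assumed latency costs (made-up constants standing in for profiled numbers):
# removing a whole head saves more run time than removing one MLP channel.
lat_head, lat_mlp = 1.0, 0.02
reg = 1e-2  # strength of the latency-aware penalty

ranked = sorted(
    [("head", i, s.item() - reg * lat_head) for i, s in enumerate(head_scores)] +
    [("mlp", i, s.item() - reg * lat_mlp) for i, s in enumerate(mlp_scores)],
    key=lambda t: t[2],
)
print("lowest-scoring groups (pruned first):", ranked[:5])
```

In a full global-pruning pass, scores like these would be gathered across every layer and the lowest-ranked groups removed network-wide, rather than pruning each layer in isolation.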

