Multi-Exit Vision Transformer for Dynamic Inference

06/29/2021
by Arian Bakhtiarnia, et al.

Deep neural networks can be converted into multi-exit architectures by inserting early exit branches after some of their intermediate layers. This makes their inference process dynamic, which is useful for time-critical IoT applications with stringent latency requirements but time-variant communication and computation resources, in particular for edge computing systems and IoT networks where the exact computation time budget is variable and not known beforehand. Vision Transformer is a recently proposed architecture that has since found many applications across various domains of computer vision. In this work, we propose seven different architectures for early exit branches that can be used for dynamic inference in Vision Transformer backbones. Through extensive experiments involving both classification and regression problems, we show that each of our proposed architectures can prove useful in the trade-off between accuracy and speed.
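To make the idea concrete, below is a minimal PyTorch sketch of a Vision Transformer backbone with early exit branches attached after some intermediate encoder blocks. The backbone configuration, the exit placement (after blocks 3, 6, and 9), and the linear-classifier heads on the class token are illustrative assumptions only; they are not the seven branch architectures proposed in the paper.

```python
# Illustrative multi-exit ViT sketch (not the paper's proposed branch designs).
import torch
import torch.nn as nn


class MultiExitViT(nn.Module):
    def __init__(self, img_size=224, patch_size=16, dim=192, depth=12,
                 num_heads=3, num_classes=1000, exit_after=(3, 6, 9)):
        super().__init__()
        num_patches = (img_size // patch_size) ** 2
        # Patch embedding: split the image into patches and project each to `dim`.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))
        # Transformer encoder blocks.
        self.blocks = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads,
                                       dim_feedforward=4 * dim,
                                       batch_first=True, norm_first=True)
            for _ in range(depth)
        ])
        self.norm = nn.LayerNorm(dim)
        # One early exit head after each block index in `exit_after`, plus the
        # final head. Each head here is just a linear classifier on the class
        # token; the paper explores richer branch architectures.
        self.exit_after = set(exit_after)
        self.exit_heads = nn.ModuleDict({
            str(i): nn.Linear(dim, num_classes) for i in exit_after
        })
        self.final_head = nn.Linear(dim, num_classes)

    def forward(self, x):
        b = x.size(0)
        x = self.patch_embed(x).flatten(2).transpose(1, 2)   # (B, N, dim)
        cls = self.cls_token.expand(b, -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        outputs = []
        for i, block in enumerate(self.blocks, start=1):
            x = block(x)
            if i in self.exit_after:
                # Early exit prediction from the intermediate class token.
                outputs.append(self.exit_heads[str(i)](x[:, 0]))
        outputs.append(self.final_head(self.norm(x)[:, 0]))
        return outputs  # one prediction per exit, shallowest first
```

At inference time, such a model could run encoder blocks only until the available time budget is exhausted and return the prediction of the deepest exit reached so far, which is what makes the computation dynamic.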
