Revisiting the Parameter Efficiency of Adapters from the Perspective of Precision Redundancy

by   Shibo Jie, et al.

Current state-of-the-art results in computer vision depend in part on fine-tuning large pre-trained vision models. However, with the exponential growth of model sizes, the conventional full fine-tuning, which needs to store a individual network copy for each tasks, leads to increasingly huge storage and transmission overhead. Adapter-based Parameter-Efficient Tuning (PET) methods address this challenge by tuning lightweight adapters inserted into the frozen pre-trained models. In this paper, we investigate how to make adapters even more efficient, reaching a new minimum size required to store a task-specific fine-tuned network. Inspired by the observation that the parameters of adapters converge at flat local minima, we find that adapters are resistant to noise in parameter space, which means they are also resistant to low numerical precision. To train low-precision adapters, we propose a computational-efficient quantization method which minimizes the quantization error. Through extensive experiments, we find that low-precision adapters exhibit minimal performance degradation, and even 1-bit precision is sufficient for adapters. The experimental results demonstrate that 1-bit adapters outperform all other PET methods on both the VTAB-1K benchmark and few-shot FGVC tasks, while requiring the smallest storage size. Our findings show, for the first time, the significant potential of quantization techniques in PET, providing a general solution to enhance the parameter efficiency of adapter-based PET methods. Code:


page 3

page 4

page 6

page 7


VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks

Recently, fine-tuning language models pre-trained on large text corpora ...

AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models

There are growing interests in adapting large-scale language models usin...

Better Fine-Tuning by Reducing Representational Collapse

Although widely adopted, existing approaches for fine-tuning pre-trained...

FacT: Factor-Tuning for Lightweight Adaptation on Vision Transformer

Recent work has explored the potential to adapt a pre-trained vision tra...

Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation

Recently, transformers have shown strong ability as visual feature extra...

VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control

As the model size of pre-trained language models (PLMs) grows rapidly, f...

Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning

The Mixture of Experts (MoE) is a widely known neural architecture where...

Please sign up or login with your details

Forgot password? Click here to reset