Parameter-Efficient Fine-Tuning without Introducing New Latency

05/26/2023
by Baohao Liao, et al.

Parameter-efficient fine-tuning (PEFT) of pre-trained language models has recently demonstrated remarkable achievements, effectively matching the performance of full fine-tuning while using significantly fewer trainable parameters, and consequently addressing storage and communication constraints. Nonetheless, various PEFT methods are limited by their inherent characteristics. In the case of sparse fine-tuning, which modifies only a small subset of the existing parameters, the selection of fine-tuned parameters is task- and domain-specific, making it unsuitable for federated learning. On the other hand, PEFT methods that add new parameters typically introduce additional inference latency. In this paper, we demonstrate the feasibility of generating a sparse mask in a task-agnostic manner, wherein all downstream tasks share a common mask. Our approach, which relies solely on the magnitude information of the pre-trained parameters, surpasses existing methodologies by a significant margin when evaluated on the GLUE benchmark. Additionally, we introduce a novel adapter technique that applies the adapter directly to the pre-trained parameters instead of to the hidden representations, thereby achieving the same inference speed as full fine-tuning. Through extensive experiments, our proposed method attains a new state-of-the-art result in terms of both performance and storage efficiency, storing only 0.03% of the parameters of full fine-tuning.
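The two techniques described in the abstract can be sketched briefly. The PyTorch snippet below is a hypothetical illustration, not the paper's implementation: the names magnitude_mask and WeightAdapter, the choice of selecting the smallest-magnitude parameters, the ratio, and the bottleneck size r are all assumptions. It shows roughly how a shared, task-agnostic mask could be derived from parameter magnitudes alone, and how an adapter applied to the pre-trained weights themselves can be folded back into the weight matrix after training, so inference runs exactly as it would after full fine-tuning.

```python
import torch
import torch.nn as nn


def magnitude_mask(model: nn.Module, ratio: float = 0.005) -> dict:
    """Task-agnostic sparse mask built from parameter magnitudes only.

    Illustrative assumption: the parameters with the smallest absolute
    magnitude are marked as trainable; the paper's exact selection rule
    and ratio may differ.
    """
    scores = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    k = max(1, int(ratio * scores.numel()))
    threshold = torch.kthvalue(scores, k).values
    return {name: p.detach().abs() <= threshold
            for name, p in model.named_parameters()}


class WeightAdapter(nn.Module):
    """Bottleneck adapter applied to a frozen pre-trained weight matrix
    rather than to hidden activations (hypothetical shapes and init)."""

    def __init__(self, pretrained_weight: torch.Tensor, r: int = 8):
        super().__init__()
        d_out, d_in = pretrained_weight.shape
        self.register_buffer("weight", pretrained_weight.detach())  # frozen W
        self.down = nn.Linear(d_in, r, bias=False)   # trainable
        self.up = nn.Linear(r, d_in, bias=False)     # trainable
        nn.init.zeros_(self.up.weight)               # start from W' = W

    def effective_weight(self) -> torch.Tensor:
        # W' = W + up(relu(down(W))): the adapter transforms the parameters,
        # so W' can be computed once after training and stored in place of W,
        # leaving the inference graph identical to full fine-tuning.
        return self.weight + self.up(torch.relu(self.down(self.weight)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.effective_weight().t()
```

After training such a sketch, effective_weight() would be computed once and saved in place of the original weight; only the small down/up matrices (or, for the sparse variant, the masked entries) would need to be stored per task.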

Related research

How fine can fine-tuning be? Learning efficient language models (04/24/2020)
State-of-the-art performance on language understanding tasks is now achi...

SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels (09/15/2023)
Pre-trained vision transformers have strong representation benefits to v...

SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers (11/29/2022)
Fine-tuning pre-trained language models (PLMs) achieves impressive perfo...

DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning (04/13/2023)
Diffusion models have proven to be highly effective in generating high-q...

Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators (06/04/2021)
This paper presents a novel pre-trained language models (PLM) compressio...

SAM-PARSER: Fine-tuning SAM Efficiently by Parameter Space Reconstruction (08/28/2023)
Segment Anything Model (SAM) has received remarkable attention as it off...

GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding (05/16/2023)
Addressing the issues of who saying what to whom in multi-party conversa...
