Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm

03/14/2023
by Hengyuan Zhao, et al.

Parameter-Efficient Transfer Learning (PETL) aims to efficiently adapt large models pre-trained on massive data to downstream tasks with limited task-specific data. In view of the practicality of PETL, previous works focus on tuning a small set of parameters for each downstream task in an end-to-end manner, while rarely considering the task distribution shift between the pre-training task and the downstream task. This paper proposes a novel two-stage paradigm, in which the pre-trained model is first aligned to the target distribution and the task-relevant information is then leveraged for effective adaptation. Specifically, the first stage narrows the task distribution shift by tuning the scale and shift in the LayerNorm layers. In the second stage, to efficiently learn the task-relevant information, we propose a Taylor expansion-based importance score to identify task-relevant channels for the downstream task and then tune only this small portion of channels, which keeps the adaptation parameter-efficient. Overall, we present a promising new direction for PETL, and the proposed paradigm achieves state-of-the-art average accuracy across 19 downstream tasks.
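The abstract does not spell out the exact form of the importance score, so the following is only a minimal sketch of how a Taylor-based channel selection of this kind might look. It assumes a single linear layer `y = W x`, a squared-error loss, and the common first-order Taylor criterion `|sum_j W[c][j] * dL/dW[c][j]|` per output channel; all three are illustrative assumptions rather than the authors' formulation. The function names `channel_importance` and `select_channels` are hypothetical.

```python
def channel_importance(weights, inputs, targets):
    """Average first-order Taylor importance per output channel.

    Assumptions (not taken from the paper): a linear layer y = W x and
    the loss L = 0.5 * sum_c (y_c - t_c)^2.

    weights: list of rows, one row of floats per output channel.
    inputs, targets: lists of samples, each a list of floats.
    """
    scores = [0.0] * len(weights)
    for x, t in zip(inputs, targets):
        for c, row in enumerate(weights):
            y_c = sum(w * x_j for w, x_j in zip(row, x))
            err = y_c - t[c]
            # dL/dW[c][j] = err * x[j]; the first-order Taylor term for
            # channel c is the sum over j of W[c][j] * dL/dW[c][j].
            taylor = sum(w * err * x_j for w, x_j in zip(row, x))
            scores[c] += abs(taylor)
    return [s / len(inputs) for s in scores]


def select_channels(scores, k):
    """Return the indices of the k highest-scoring channels, ascending."""
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return sorted(ranked[:k])


# Toy data: 3 output channels, 2 input features, 2 samples.
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
inputs = [[1.0, 2.0], [2.0, 0.0]]
targets = [[1.0, 0.0, 1.0], [2.0, 1.0, 1.0]]

scores = channel_importance(weights, inputs, targets)
tunable = select_channels(scores, k=2)  # only these channels would be tuned
```

In a setup like the one described, a selection step of this kind would run after the first stage (tuning the LayerNorm scale and shift), and only the selected channels would then receive gradient updates while the rest of the layer stays frozen.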

