Refocusing Is Key to Transfer Learning

05/24/2023
by Baifeng Shi, et al.

Transfer learning involves adapting a pre-trained model to novel downstream tasks. However, we observe that current transfer learning methods often fail to focus on task-relevant features. In this work, we emphasize the importance of refocusing attention in transfer learning. We introduce Top-Down Attention Steering (TOAST), a novel transfer learning algorithm that keeps the pre-trained backbone frozen, selects the task-relevant elements in the output, and feeds them back into the model to steer its attention toward task-specific features. By refocusing attention alone, TOAST achieves state-of-the-art results on a number of transfer learning benchmarks while tuning only a small fraction of the parameters. Compared to full fine-tuning, LoRA, and prompt tuning, TOAST substantially improves performance across a range of fine-grained visual classification datasets (e.g., 81.1 on the FGVC benchmark). TOAST also outperforms the fully fine-tuned Alpaca model on instruction-following language generation. Code is available at https://github.com/bfshi/TOAST.
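The mechanism the abstract describes — a frozen backbone, a learned selection of task-relevant output features, and a second forward pass conditioned on that feedback — can be illustrated with a minimal sketch. This is a hypothetical toy, not the authors' implementation: the real TOAST backbone is a transformer with attention layers, whereas here a fixed linear map stands in for the frozen model, and `w_select` and `W_feedback` are assumed names for the only trainable (task-specific) parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16

# Stand-in for the frozen pre-trained backbone: weights never updated
# during transfer (in TOAST this would be a pre-trained transformer).
W_backbone = rng.standard_normal((D, D)) / np.sqrt(D)

def backbone(x, feedback=None):
    """One forward pass; an optional top-down signal is added to the input."""
    if feedback is not None:
        x = x + feedback
    return np.tanh(x @ W_backbone)

# The only tunable, task-specific parameters in this sketch: a relevance
# gate over output channels and a projection for the feedback signal.
w_select = rng.standard_normal(D)
W_feedback = rng.standard_normal((D, D)) / np.sqrt(D)

def toast_forward(x):
    # Pass 1: bottom-up features from the frozen backbone.
    feats = backbone(x)
    # Select task-relevant elements with a learned sigmoid gate.
    gate = 1.0 / (1.0 + np.exp(-(feats * w_select)))
    selected = feats * gate
    # Pass 2: feed the selected features back to steer the backbone
    # toward task-specific features.
    return backbone(x, feedback=selected @ W_feedback)

x = rng.standard_normal(D)
out = toast_forward(x)
print(out.shape)
```

Because the backbone weights are shared across both passes and stay frozen, only the selection and feedback parameters would be trained on the downstream task, which is what keeps the tunable-parameter count small.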


