Hydra: Multi-head Low-rank Adaptation for Parameter Efficient Fine-tuning

by   Sanghyeon Kim, et al.

The recent surge in large-scale foundation models has spurred the development of efficient methods for adapting these models to various downstream tasks. Low-rank adaptation methods, such as LoRA, have gained significant attention due to their outstanding parameter efficiency and no additional inference latency. This paper investigates a more general form of adapter module based on the analysis that parallel and sequential adaptation branches learn novel and general features during fine-tuning, respectively. The proposed method, named Hydra, due to its multi-head computational branches, combines parallel and sequential branch to integrate capabilities, which is more expressive than existing single branch methods and enables the exploration of a broader range of optimal points in the fine-tuning process. In addition, the proposed adaptation method explicitly leverages the pre-trained weights by performing a linear combination of the pre-trained features. It allows the learned features to have better generalization performance across diverse downstream tasks. Furthermore, we perform a comprehensive analysis of the characteristics of each adaptation branch with empirical evidence. Through an extensive range of experiments, encompassing comparisons and ablation studies, we substantiate the efficiency and demonstrate the superior performance of Hydra. This comprehensive evaluation underscores the potential impact and effectiveness of Hydra in a variety of applications. Our code is available on <https://github.com/extremebird/Hydra>


KronA: Parameter Efficient Tuning with Kronecker Adapter

Fine-tuning a Pre-trained Language Model (PLM) on a specific downstream ...

LoRA: Low-Rank Adaptation of Large Language Models

The dominant paradigm of natural language processing consists of large-s...

One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning

We present Generalized LoRA (GLoRA), an advanced approach for universal ...

SAM-OCTA: A Fine-Tuning Strategy for Applying Foundation Model to OCTA Image Segmentation Tasks

In the analysis of optical coherence tomography angiography (OCTA) image...

Reprogramming under constraints: Revisiting efficient and reliable transferability of lottery tickets

In the era of foundation models with huge pre-training budgets, the down...

Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF

During the last stage of RLHF, a large language model is aligned to huma...

DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation

With the ever-growing size of pre-trained models (PMs), fine-tuning them...

Please sign up or login with your details

Forgot password? Click here to reset