PerPLM: Personalized Fine-tuning of Pretrained Language Models via Writer-specific Intermediate Learning and Prompts

09/14/2023
by Daisuke Oba, et al.

The meanings of words and phrases depend not only on where they are used (contexts) but also on who uses them (writers). Pretrained language models (PLMs) are powerful tools for capturing context, but they are typically pretrained and fine-tuned for universal use across different writers. This study aims to improve the accuracy of text understanding tasks by personalizing the fine-tuning of PLMs for specific writers. We focus on a general setting where only plain text from target writers is available for personalization. To avoid the cost of fine-tuning and storing multiple copies of PLMs for different users, we exhaustively explore the use of writer-specific prompts to personalize a unified PLM. Since the design and evaluation of such prompts is an underdeveloped area, we introduce and compare different types of prompts that are feasible in our setting. To maximize the potential of prompt-based personalized fine-tuning, we also propose personalized intermediate learning based on masked language modeling, which extracts task-independent traits of writers' text. Our experiments, using multiple tasks, datasets, and PLMs, reveal the nature of the different prompts and the effectiveness of our intermediate learning approach.
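To make the approach described in the abstract concrete, the following is a minimal sketch (not the authors' released code) of how writer-specific prompts and masked-language-modeling intermediate learning could be combined on top of a single shared PLM: one prompt token is registered per target writer, and the intermediate step masks tokens in that writer's plain text so the prompt and model absorb task-independent writing traits before any task-specific fine-tuning. The model name, writer IDs, and training-loop details are illustrative assumptions.

```python
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
)

# Shared PLM used for all writers (assumed backbone; the paper evaluates several PLMs).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# One special prompt token per target writer (a hard-prompt variant; soft prompts
# would instead add trainable embedding vectors outside the vocabulary).
writer_ids = ["writer_001", "writer_002"]
prompt_tokens = [f"[WRITER_{w}]" for w in writer_ids]
tokenizer.add_special_tokens({"additional_special_tokens": prompt_tokens})
model.resize_token_embeddings(len(tokenizer))

# Standard MLM collator: randomly masks 15% of tokens and builds labels.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

def intermediate_mlm_step(writer_id: str, plain_text: str) -> float:
    """One intermediate-learning step: mask the writer's plain text and predict
    the masked tokens conditioned on the writer-specific prompt token."""
    prompted = f"[WRITER_{writer_id}] {plain_text}"
    enc = tokenizer(prompted, truncation=True, max_length=128, return_tensors="pt")
    batch = collator([{k: v.squeeze(0) for k, v in enc.items()}])
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Example: adapt the shared PLM to one writer's unlabeled text.
print(intermediate_mlm_step("writer_001", "honestly the new release feels kinda rushed tbh"))
```

In this sketch, task-specific fine-tuning would follow the same pattern: prepend the writer's prompt token to each input so a single set of PLM weights serves all writers, rather than storing one fine-tuned copy per writer.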

