Diverse Data Augmentation with Diffusions for Effective Test-time Prompt Tuning

08/11/2023
by Chun-Mei Feng, et al.

Benefiting from prompt tuning, recent years have witnessed the promising performance of pre-trained vision-language models, e.g., CLIP, on versatile downstream tasks. In this paper, we focus on a particular setting of learning adaptive prompts on the fly for each test sample from an unseen new domain, which is known as test-time prompt tuning (TPT). Existing TPT methods typically rely on data augmentation and confidence selection. However, conventional data augmentation techniques, e.g., random resized crops, suffer from a lack of data diversity, while entropy-based confidence selection alone is not sufficient to guarantee prediction fidelity. To address these issues, we propose a novel TPT method, named DiffTPT, which leverages pre-trained diffusion models to generate diverse and informative new data. Specifically, we combine data augmented by both conventional methods and pre-trained Stable Diffusion to exploit their respective merits, improving the model's ability to adapt to unknown new test data. Moreover, to ensure the prediction fidelity of generated data, we introduce a cosine similarity-based filtration technique to select the generated data with higher similarity to the single test sample. Our experiments on test datasets with distribution shifts and unseen categories demonstrate that DiffTPT improves the zero-shot accuracy by an average of 5.13% compared to the state-of-the-art TPT method. Our code and models will be publicly released.
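The cosine similarity-based filtration described above can be sketched in a few lines. This is a minimal NumPy sketch, not the paper's implementation: in DiffTPT the features would come from CLIP's image encoder, and the `threshold` value here is a hypothetical hyperparameter chosen for illustration.

```python
import numpy as np

def cosine_filter(test_feat, gen_feats, threshold=0.8):
    """Keep diffusion-generated samples whose image features are
    cosine-similar to the single test sample's features.

    test_feat: (D,) feature vector of the test image
    gen_feats: (N, D) feature vectors of the generated images
    Returns the indices of generated samples passing the threshold.
    """
    # L2-normalize so the dot product equals cosine similarity
    t = test_feat / np.linalg.norm(test_feat)
    g = gen_feats / np.linalg.norm(gen_feats, axis=1, keepdims=True)
    sims = g @ t  # one similarity score per generated sample
    return np.nonzero(sims >= threshold)[0]
```

For example, a generated sample nearly aligned with the test feature passes the filter, while one pointing the opposite way is discarded; the surviving samples would then be fed to the entropy-based prompt-tuning objective.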


