Sensitivity and Robustness of Large Language Models to Prompt in Japanese

05/15/2023
by Chengguang Gan, et al.

Prompt engineering has gained significant relevance in recent years, fueled by advancements in pre-trained and large language models. However, a critical issue has been identified within this domain: the high sensitivity and lack of robustness of these models with respect to prompt templates, particularly in lesser-studied languages such as Japanese. This paper explores the issue through a comprehensive evaluation of several representative Large Language Models (LLMs) and a widely-used pre-trained model (PLM), T5. These models are scrutinized using a benchmark dataset in Japanese, with the aim of assessing and analyzing the performance of current multilingual models in this context. Our experimental results reveal startling discrepancies: a simple modification to the sentence structure of the prompt template caused the accuracy of GPT-4 to drop drastically from 49.21 to 25.44. This observation underscores the fact that even the high-performing GPT-4 model encounters significant stability issues when dealing with diverse Japanese prompt templates, calling the consistency of its outputs into question. In light of these findings, we conclude by proposing potential research directions to further enhance the development and performance of Large Language Models at their current stage.
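To make the evaluation setup concrete, the sketch below (Python) shows one way to measure prompt-template sensitivity: the same Japanese classification task is run under templates that differ only in sentence structure, and per-template accuracy is compared. The query_model stub, the templates, and the toy dataset are illustrative assumptions, not the paper's actual benchmark or code.

# Minimal sketch of a prompt-template sensitivity check.
# query_model, TEMPLATES, and DATASET are illustrative assumptions.

def query_model(prompt: str) -> str:
    # Hypothetical stand-in for an LLM API call (e.g., GPT-4 or T5 inference).
    # Returns a fixed answer so the sketch runs end to end; a real run would
    # query the model under test.
    return "ポジティブ"

# Two templates with the same intent but different sentence structure.
TEMPLATES = [
    "次の文の感情を「ポジティブ」か「ネガティブ」で答えてください。文: {text}",
    "文: {text}\nこの文はポジティブですか、それともネガティブですか。",
]

# Toy labeled examples; a real run would use a Japanese benchmark dataset.
DATASET = [
    {"text": "この映画は本当に素晴らしかった。", "label": "ポジティブ"},
    {"text": "対応が遅くてがっかりした。", "label": "ネガティブ"},
]

def template_accuracy(template: str) -> float:
    correct = 0
    for example in DATASET:
        prompt = template.format(text=example["text"])
        answer = query_model(prompt)
        # Lenient matching: count a hit if the gold label appears in the answer.
        correct += int(example["label"] in answer)
    return correct / len(DATASET)

for template in TEMPLATES:
    print(f"{template_accuracy(template):6.2%}  {template.splitlines()[0]}")

A real experiment would swap in an actual model client, a full benchmark, and many more template variants; the point of the loop is that nothing changes between runs except the template wording.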


research · 04/02/2023
Better Language Models of Code through Self-Improvement
Pre-trained language models for code (PLMCs) have gained attention in re...

research · 08/09/2022
Compositional Evaluation on Japanese Textual Entailment and Similarity
Natural Language Inference (NLI) and Semantic Textual Similarity (STS) a...

research · 06/19/2023
Jamp: Controlled Japanese Temporal Inference Dataset for Evaluating Generalization Capacity of Language Models
Natural Language Inference (NLI) tasks involving temporal inference rema...

research · 07/24/2023
A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models
Prompt engineering is a technique that involves augmenting a large pre-t...

research · 12/20/2022
mFACE: Multilingual Summarization with Factual Consistency Evaluation
Abstractive summarization has enjoyed renewed interest in recent years, ...

research · 03/04/2023
Could a Large Language Model be Conscious?
There has recently been widespread discussion of whether large language ...

research · 06/21/2023
Opening the Black Box: Analyzing Attention Weights and Hidden States in Pre-trained Language Models for Non-language Tasks
Investigating deep learning language models has always been a significan...
