Flesch or Fumble? Evaluating Readability Standard Alignment of Instruction-Tuned Language Models

09/11/2023
by   Joseph Marvin Imperial, et al.
Readability metrics and standards such as the Flesch-Kincaid Grade Level (FKGL) and the Common European Framework of Reference for Languages (CEFR) exist to help teachers and educators assess the complexity of educational materials before using them in the classroom. In this study, we select a diverse set of open and closed-source instruction-tuned language models and investigate their performance in writing story completions and simplifying narratives, tasks that teachers perform, using standard-guided prompts that control text readability. Our extensive findings provide empirical evidence that globally recognized models like ChatGPT may be less effective on these generative tasks, and may require more refined prompts, than open-source models such as BLOOMZ and FlanT5, which have shown promising results.
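
For readers unfamiliar with FKGL, the following is a minimal sketch of how the score can be computed for a piece of generated text. It uses the standard published formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sentence splitter and the vowel-group syllable counter are simplifying assumptions made here for illustration, and the function names `count_syllables` and `fkgl` are hypothetical; none of this is taken from the paper's evaluation code.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumption): count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Heuristic: a trailing 'e' (but not 'le') is usually silent.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level using naive sentence and word splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    simple = "The cat sat on the mat."
    dense = "Readability metrics guide educators in assessing complexity of educational materials."
    print(round(fkgl(simple), 2), round(fkgl(dense), 2))
```

A heuristic syllable counter will disagree with dictionary-based tools on some words, so scores from this sketch should be treated as approximate. They are still enough to check whether a model's output lands near a target grade level.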


