An Analysis of GPT-3's Performance in Grammatical Error Correction

03/25/2023
by Steven Coyne, et al.

GPT-3 models achieve high performance on a variety of natural language processing tasks. However, there is a relative lack of detailed published analysis of how well they perform on grammatical error correction (GEC). To address this, we test a GPT-3 model (text-davinci-003) against major GEC benchmarks, comparing several prompts in both zero-shot and few-shot settings. We analyze intriguing or problematic outputs encountered with different prompt formats. We report the performance of our best prompt on the BEA-2019 and JFLEG datasets using a combination of automatic metrics and human evaluations, revealing interesting differences between the preferences of human raters and the reference-based automatic metrics.
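
The zero-shot versus few-shot distinction mentioned in the abstract can be illustrated with a minimal sketch. The instruction wording, the `Input:`/`Output:` labels, and the `build_prompt` helper are all hypothetical and are not the prompts used in the paper; the sketch only shows how the two settings differ in what is placed before the sentence to be corrected.

```python
from typing import Sequence, Tuple

# Hypothetical instruction text; the paper's actual prompts are not shown here.
INSTRUCTION = "Correct the grammatical errors in the following sentence."


def build_prompt(sentence: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a GEC prompt.

    With no examples this is a zero-shot prompt (instruction plus input only).
    With (source, corrected) pairs it becomes a few-shot prompt, where each
    pair is shown as a worked demonstration before the real input.
    """
    parts = [INSTRUCTION, ""]
    for source, corrected in examples:
        parts += [f"Input: {source}", f"Output: {corrected}", ""]
    # The final "Output:" is left open for the model to complete.
    parts += [f"Input: {sentence}", "Output:"]
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("She go to school."))
    print("---")
    print(build_prompt("She go to school.", [("He have a cat.", "He has a cat.")]))
```

The only difference between the two calls is the list of demonstration pairs, which makes it straightforward to hold the instruction fixed while varying the number of examples.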


research
07/09/2023

Can Generative Large Language Models Perform ASR Error Correction?

ASR error correction continues to serve as an important part of post-pro...
research
04/04/2023

Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation

ChatGPT, a large-scale language model based on the advanced GPT-3.5 arch...
research
05/13/2023

Zero-shot Faithful Factual Error Correction

Faithfully correcting factual errors is critical for maintaining the int...
research
08/17/2023

Evaluation of really good grammatical error correction

Although rarely stated, in practice, Grammatical Error Correction (GEC) ...
research
01/20/2022

Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction

In grammatical error correction (GEC), automatic evaluation is an import...
research
05/29/2023

Exploring Effectiveness of GPT-3 in Grammatical Error Correction: A Study on Performance and Controllability in Prompt-Based Methods

Large-scale pre-trained language models such as GPT-3 have shown remarka...
research
06/08/2023

Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework

Factuality is important to dialogue summarization. Factual error correct...
