Improving Code Generation by Training with Natural Language Feedback

03/28/2023
by Angelica Chen, et al.

The potential for pre-trained large language models (LLMs) to use natural language feedback at inference time has been an exciting recent development. We build upon this observation by formalizing an algorithm for learning from natural language feedback at training time instead, which we call Imitation learning from Language Feedback (ILF). ILF requires only a small amount of human-written feedback during training and does not require the same feedback at test time, making it both user-friendly and sample-efficient. We further show that ILF can be seen as a form of minimizing the KL divergence to the ground truth distribution and demonstrate a proof-of-concept on a neural program synthesis task. We use ILF to improve a CodeGen-Mono 6.1B model's pass@1 rate by 38% (relative) on the Mostly Basic Python Problems (MBPP) benchmark, outperforming both fine-tuning on MBPP and fine-tuning on repaired programs written by humans. Overall, our results suggest that learning from human-written natural language feedback is both more effective and sample-efficient than training exclusively on demonstrations for improving an LLM's performance on code generation tasks.
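The pass@1 rate cited above is the standard metric for program synthesis benchmarks like MBPP: the probability that a single sampled program passes all unit tests. As an illustration (not code from the paper), here is a minimal sketch of the commonly used unbiased pass@k estimator, computed from n generated samples of which c are correct:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one
    of k programs, drawn without replacement from n generated samples
    (c of which pass the unit tests), is correct."""
    if n - c < k:
        # Fewer incorrect samples than k: every draw of k must
        # contain at least one correct program.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k = 1 this reduces to the fraction of correct samples,
# e.g. 2 correct programs out of 10 generations gives pass@1 = 0.2.
score = pass_at_k(10, 2, 1)
```

Scores are then averaged over all benchmark problems to get the reported pass@1 rate.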


