Towards Enhancing In-Context Learning for Code Generation

03/31/2023
by   Jia Li, et al.
0

In-context learning (ICL) with pre-trained language models (PTLMs) has shown great success in code generation. ICL does not require training. PTLMs take as the input a prompt consisting of a few requirement-code examples and a new requirement, and output a new program. However, existing studies simply reuse ICL techniques for natural language generation and ignore unique features of code generation. We refer to these studies as standard ICL. Inspired by observations of the human coding process, we propose a novel ICL approach for code generation named AceCoder. Compared to standard ICL, AceCoder has two novelties. (1) Example retrieval. It retrieves similar programs as examples and learns programming skills (e.g., algorithms, APIs) from them. (2) Guided Code Generation. It encourages PTLMs to output an intermediate preliminary (e.g., test cases, APIs) before generating programs. The preliminary can help PTLMs understand requirements and guide the next code generation. We apply AceCoder to six PTLMs (e.g., Codex) and evaluate it on three public benchmarks using the Pass@k. Results show that AceCoder can significantly improve the performance of PTLMs on code generation. (1) In terms of Pass@1, AceCoder outperforms standard ICL by up to 79.7 models by up to 171 (e.g., 1B to 175B) and different languages (e.g., Python, Java, and JavaScript). (3) We investigate multiple choices of the intermediate preliminary. (4) We manually evaluate generated programs in three aspects and prove the superiority of AceCoder. (5) Finally, we discuss some insights about ICL for practitioners.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/11/2023

Enabling Programming Thinking in Large Language Models Toward Code Generation

Large Language Models (LLMs) (e.g., ChatGPT) have shown impressive perfo...
research
02/13/2023

SkCoder: A Sketch-based Approach for Automatic Code Generation

Recently, deep learning techniques have shown great success in automatic...
research
03/09/2023

Planning with Large Language Models for Code Generation

Existing large language model-based code generation pipelines typically ...
research
03/25/2023

Combining Contexts from Multiple Sources for Documentation-Specific Code Example Generation

Code example is a crucial part of good documentation. It helps the devel...
research
05/20/2021

Measuring Coding Challenge Competence With APPS

While programming is one of the most broadly applicable skills in modern...
research
07/28/2023

Private-Library-Oriented Code Generation with Large Language Models

Large language models (LLMs), such as Codex and GPT-4, have recently sho...

Please sign up or login with your details

Forgot password? Click here to reset