InCoder: A Generative Model for Code Infilling and Synthesis

04/12/2022
by Daniel Fried, et al.

Code is seldom written in a single left-to-right pass and is instead repeatedly edited and refined. We introduce InCoder, a unified generative model that can perform program synthesis (via left-to-right generation) as well as editing (via infilling). InCoder is trained to generate code files from a large corpus of permissively licensed code, where regions of code have been randomly masked and moved to the end of each file, allowing code infilling with bidirectional context. Our model is the first generative model that is able to directly perform zero-shot code infilling, which we evaluate on challenging tasks such as type inference, comment generation, and variable re-naming. We find that the ability to condition on bidirectional context substantially improves performance on these tasks, and that the model still performs comparably on standard program synthesis benchmarks to left-to-right-only models pretrained at similar scale. The InCoder models and code are publicly released at https://sites.google.com/view/incoder-code-models
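The "mask and move to the end" idea in the abstract can be illustrated with a small sketch. This is not the released InCoder code: the sentinel strings `<MASK:0>` and `<EOM>`, the character-level spans, and all function names are assumptions made for illustration. It shows how a masked region can be relocated to the end of a file so that a left-to-right model sees both the left and right context before generating the missing span.

```python
import random

# Hypothetical sentinel strings, chosen for illustration only.
MASK = "<MASK:0>"
END_OF_MASK = "<EOM>"


def mask_and_move(document: str, start: int, end: int) -> str:
    """Replace document[start:end] with a sentinel and append the span at the end."""
    if not 0 <= start <= end <= len(document):
        raise ValueError("span out of range")
    span = document[start:end]
    # Left context, sentinel, right context, then the sentinel again
    # followed by the masked span and an end-of-span marker.
    return document[:start] + MASK + document[end:] + MASK + span + END_OF_MASK


def random_mask_and_move(document: str, rng: random.Random) -> str:
    """Build a training example by masking one randomly chosen character span."""
    start = rng.randrange(len(document) + 1)
    end = rng.randrange(start, len(document) + 1)
    return mask_and_move(document, start, end)


def build_infill_prompt(left: str, right: str) -> str:
    """Build an inference prompt: the model would continue after the final sentinel."""
    return left + MASK + right + MASK


def splice_infill(left: str, right: str, generated: str) -> str:
    """Cut the generated text at the end-of-span marker and splice it into place."""
    infill = generated.split(END_OF_MASK, 1)[0]
    return left + infill + right


if __name__ == "__main__":
    source = "def add(a, b):\n    return a + b\n"
    print(mask_and_move(source, 4, 7))
    print(random_mask_and_move(source, random.Random(0)))
```

At inference time, under the same assumptions, `build_infill_prompt` would be passed to a left-to-right model and its continuation handed to `splice_infill`, which keeps only the text generated before the end-of-span marker.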

research
06/23/2023

Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale

Large-scale generative models such as GPT and DALL-E have revolutionized...
research
10/27/2020

Fast Interleaved Bidirectional Sequence Generation

Independence assumptions during sequence generation can speed up inferen...
research
05/24/2022

Learning to Model Editing Processes

Most existing sequence generation models produce outputs in one pass, us...
research
02/24/2019

Synchronous Bidirectional Inference for Neural Sequence Generation

In sequence to sequence generation tasks (e.g. machine translation and a...
research
01/19/2022

CM3: A Causal Masked Multimodal Model of the Internet

We introduce CM3, a family of causally masked generative models trained ...
research
01/09/2023

SantaCoder: don't reach for the stars!

The BigCode project is an open-scientific collaboration working on the r...
research
03/01/2023

R-U-SURE? Uncertainty-Aware Code Suggestions By Maximizing Utility Across Random User Intents

Large language models show impressive results at predicting structured t...
