Generative Type Inference for Python

by Yun Peng, et al.

Python is a popular dynamic programming language, as evidenced by its ranking as the second most commonly used language on GitHub. However, its dynamic type system can lead to type errors, which has prompted researchers to explore automatic type inference approaches for Python programs. Rule-based type inference approaches can ensure the accuracy of predicted variable types, but they suffer from low coverage. Supervised type inference approaches, while feature-agnostic, require large, high-quality annotated datasets and are limited to pre-defined types. Cloze-style approaches are zero-shot: they reformulate type inference as a fill-in-the-blank problem, but their performance is limited. This paper introduces TypeGen, a few-shot generative type inference approach that incorporates domain knowledge from static analysis. TypeGen creates chain-of-thought (COT) prompts by translating the type inference steps of static analysis into prompts based on type dependency graphs (TDGs), enabling language models to learn from how static analysis infers types. By combining COT prompts with code slices and type hints, TypeGen constructs example prompts from human annotations. TypeGen requires only a few annotated examples to teach language models to generate similar COT prompts via in-context learning. Moreover, TypeGen enhances the interpretability of results through an input-explanation-output strategy. Experiments show that TypeGen outperforms the best baseline, Type4Py, by 10.0% for argument type prediction and 22.5% for return value type prediction in terms of top-1 Exact Match, using only five examples. Furthermore, TypeGen achieves substantial improvements of 27% to 84% compared to the zero-shot performance of large language models with parameter sizes ranging from 1.3B to 175B in terms of top-1 Exact Match.
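
The prompt construction described in the abstract can be pictured with a minimal sketch. It assumes a plain-text prompt in which each demonstration lists a code slice, candidate type hints, a question, a chain-of-thought explanation, and the annotated type, in input-explanation-output order. The `Example` class, its field names, and the prompt wording are hypothetical and are not taken from the paper; the explanation text is written by hand here, whereas TypeGen derives it from a type dependency graph.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One prompt entry. All field names are hypothetical."""
    code_slice: str         # statements relevant to the target variable
    target: str             # variable whose type is being asked for
    type_hints: List[str]   # candidate types offered to the model
    explanation: str = ""   # chain-of-thought text; empty for the query
    answer: str = ""        # annotated type; empty for the query


def render(ex: Example) -> str:
    """Render one entry in input-explanation-output order."""
    lines = [
        "Python code:",
        ex.code_slice,
        f"Available types: {', '.join(ex.type_hints)}",
        f"Q: What is the type of the variable {ex.target}?",
    ]
    if ex.answer:
        # Demonstration: explanation first, then the final type.
        lines.append(
            f"A: {ex.explanation} "
            f"Therefore, the type of {ex.target} is `{ex.answer}`."
        )
    else:
        # Query: leave the answer open for the model to complete.
        lines.append("A:")
    return "\n".join(lines)


def build_prompt(demos: List[Example], query: Example) -> str:
    """Concatenate the demonstrations, then the unanswered query."""
    return "\n\n".join(render(e) for e in demos + [query])


demo = Example(
    code_slice="def count(items):\n    n = len(items)\n    return n",
    target="n",
    type_hints=["int", "str", "list"],
    explanation=(
        "First, n is assigned from the call len(items). "
        "Second, len returns an int."
    ),
    answer="int",
)
query = Example(
    code_slice="def name_of(user):\n    label = str(user.id)\n    return label",
    target="label",
    type_hints=["int", "str"],
)
prompt = build_prompt([demo], query)
print(prompt)
```

The resulting string would be sent to a language model, which is expected to continue after the final `A:` with its own explanation and predicted type. The abstract reports results with five demonstrations; one is used here only to keep the sketch short.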


