Automated Variable Renaming: Are We There Yet?

12/12/2022
by Antonio Mastropaolo, et al.

Identifiers, such as method and variable names, form a large portion of source code. Therefore, low-quality identifiers can substantially hinder code comprehension. To support developers in using meaningful identifiers, several (semi-)automatic techniques have been proposed, most of them data-driven (e.g., statistical language models, deep learning models) or relying on static code analysis. Still, few empirical investigations have examined how effective such techniques are at recommending meaningful identifiers to developers, possibly resulting in rename refactoring operations. We present a large-scale study investigating the potential of data-driven approaches to support automated variable renaming. We experiment with three state-of-the-art techniques: a statistical language model and two DL-based models. The three approaches have been trained and tested on three datasets we built with the goal of evaluating their ability to recommend meaningful variable identifiers. Our quantitative and qualitative analyses show the potential of such techniques, which, under specific conditions, can provide valuable recommendations and are ready to be integrated into rename refactoring tools. Nonetheless, our results also highlight limitations of the evaluated approaches that call for further research in this field.


