How We Refactor and How We Document it? On the Use of Supervised Machine Learning Algorithms to Classify Refactoring Documentation

10/26/2020
by   Eman Abdullah AlOmar, et al.

Refactoring is the art of improving the design of a system without altering its external behavior. Refactoring has become a well-established and disciplined software engineering practice that has attracted a significant amount of research presuming that refactoring is primarily motivated by the need to improve system structures. However, recent studies have shown that developers may incorporate refactorings into other development activities that go beyond improving the design. Unfortunately, these studies are limited to developer interviews and a reduced set of projects. To cope with these limitations, we aim to better understand what motivates developers to apply refactoring by mining and classifying a large set of 111,884 commits containing refactorings, extracted from 800 Java projects. We trained a multi-class classifier to categorize these commits into three categories, namely Internal QA, External QA, and Code Smell Resolution, along with the traditional BugFix and Functional categories. This classification challenges the original definition of refactoring as being exclusive to improving the design and fixing code smells. Further, to better understand our classification results, we analyzed commit messages to extract textual patterns that developers regularly use to describe their refactorings. The results show that (1) fixing code smells is not the main driver for developers to refactor their codebases, as refactoring is solicited for a wide variety of reasons that go beyond its traditional definition; (2) the distribution of refactorings differs between production and test files; (3) developers use several patterns to purposefully target refactoring; (4) the textual patterns extracted from commit messages provide better coverage of how developers document their refactorings.
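To make the classification step concrete, the following is a minimal sketch of a multi-class commit-message classifier over the five categories named above. It is an illustration only: the abstract does not say which learning algorithm or features were used, so a small multinomial Naive Bayes with add-one smoothing stands in here. The class name `NaiveBayesCommitClassifier`, the `tokenize` helper, and every training message in `TOY_COMMITS` are hypothetical and are not taken from the study's dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(message):
    """Lowercase a commit message and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", message.lower())


class NaiveBayesCommitClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.doc_counts = Counter()              # label -> number of commits
        self.vocabulary = set()

    def fit(self, messages, labels):
        for message, label in zip(messages, labels):
            tokens = tokenize(message)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocabulary.update(tokens)
        return self

    def predict(self, message):
        tokens = tokenize(message)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, invented for this sketch (not from the paper).
TOY_COMMITS = [
    ("refactor to reduce coupling and improve cohesion", "Internal QA"),
    ("simplify class hierarchy to reduce complexity", "Internal QA"),
    ("refactor for better performance and readability", "External QA"),
    ("improve testability and performance of parser", "External QA"),
    ("remove duplicate code and dead code", "Code Smell Resolution"),
    ("split long method to remove code smell", "Code Smell Resolution"),
    ("fix null pointer bug in login", "BugFix"),
    ("fix crash bug when file is missing", "BugFix"),
    ("add new feature for export", "Functional"),
    ("implement new feature to support csv import", "Functional"),
]

messages, labels = zip(*TOY_COMMITS)
classifier = NaiveBayesCommitClassifier().fit(messages, labels)

if __name__ == "__main__":
    for text in ["fix bug in parser", "remove duplicate code"]:
        print(text, "->", classifier.predict(text))
```

With only ten invented messages this says nothing about real accuracy; it only shows the shape of the task, in which each commit message is mapped to exactly one of the five categories.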


