The Forgotten Role of Search Queries in IR-based Bug Localization: An Empirical Study

08/11/2021
by   Mohammad Masudur Rahman, et al.

Being light-weight and cost-effective, IR-based approaches for bug localization have shown promise in finding software bugs. However, the accuracy of these approaches depends heavily on the bug reports they use. A significant number of bug reports contain only plain natural language text. According to existing studies, IR-based approaches cannot perform well when they use these bug reports as search queries. However, recent evidence suggests that even these natural language-only reports contain enough good keywords to localize the bugs successfully. On one hand, these findings suggest that natural language-only bug reports might be a sufficient source of good query keywords. On the other hand, they cast serious doubt on the query selection practices in IR-based bug localization. In this article, we attempt to clarify this issue by conducting an in-depth empirical study that critically examines the state-of-the-art query selection practices in IR-based bug localization. In particular, we use a dataset of 2,320 bug reports, employ ten existing approaches from the literature, use a Genetic Algorithm-based approach to construct optimal and near-optimal search queries from these bug reports, and then answer three research questions. We confirm that the state-of-the-art query construction approaches are indeed not sufficient for constructing appropriate queries (for bug localization) from certain natural language-only bug reports, even though these reports contain such queries. We also demonstrate that optimal and non-optimal queries chosen from bug report texts differ significantly in terms of several keyword characteristics, which has led us to actionable insights. Furthermore, we demonstrate 27 non-optimal queries through the application of our actionable insights to them.
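
The idea of searching a bug report's own keywords for a near-optimal query can be illustrated with a minimal sketch. It is not the authors' implementation: the TF-IDF scoring, the fitness function (rank of the known buggy file), the bit-mask encoding, and the names `rank_of_target` and `evolve_query` are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def rank_of_target(query_terms, corpus, target):
    """Pessimistic 1-based rank of `target` under a simple TF-IDF sum.

    `corpus` maps a file name to its text. Ties count against the target.
    The scoring scheme is an illustrative assumption, not the paper's engine.
    """
    n = len(corpus)
    docs = {name: Counter(body.lower().split()) for name, body in corpus.items()}

    def idf(term):
        df = sum(1 for tf in docs.values() if term in tf)
        return math.log((n + 1) / (df + 1)) + 1.0

    scores = {
        name: sum(tf[t] * idf(t) for t in set(query_terms))
        for name, tf in docs.items()
    }
    others = [s for name, s in scores.items() if name != target]
    return 1 + sum(1 for s in others if s >= scores[target])


def evolve_query(report_terms, corpus, target, generations=30, pop_size=20, seed=0):
    """Toy genetic algorithm: pick a keyword subset that ranks `target` highest.

    Each individual is a boolean mask over the distinct report terms.
    Assumes the report has at least two distinct terms.
    """
    rng = random.Random(seed)
    terms = sorted(set(report_terms))

    def fitness(mask):
        query = [t for t, keep in zip(terms, mask) if keep]
        if not query:
            return float("inf")  # an empty query retrieves nothing useful
        return rank_of_target(query, corpus, target)

    population = [[rng.random() < 0.5 for _ in terms] for _ in range(pop_size)]
    population[0] = [True] * len(terms)  # seed with the full report as baseline

    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # elitism: best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(terms))  # single-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(len(terms))
            child[i] = not child[i]  # flip one keyword in or out
            children.append(child)
        population = parents + children

    best = min(population, key=fitness)
    return [t for t, keep in zip(terms, best) if keep], fitness(best)
```

Because the full report is seeded into the initial population and the best half always survives, the returned query never ranks the buggy file worse than the whole report does. Note that this fitness needs the ground-truth buggy file, so a search of this kind serves as an upper bound for analysis rather than a deployable query builder.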


