BoostNSift: A Query Boosting and Code Sifting Technique for Method Level Bug Localization
Locating bugs is an important but effort-intensive and time-consuming task when dealing with large-scale systems. To address this, Information Retrieval (IR) techniques are increasingly being used to suggest potentially buggy source code locations for given bug reports. While IR techniques are very scalable, in practice their effectiveness in accurately localizing bugs in a software system remains low. Results of empirical studies suggest that the effectiveness of bug localization techniques can be improved by the configuration of the queries used to locate buggy code. However, most IR-based bug localization techniques presented by researchers do not fully consider the impact of query configuration. In a similar vein, techniques treat all code elements as equally suspicious of being buggy while localizing bugs, but this is not always the case either. In this paper, we present a new method-level, information-retrieval-based bug localization technique called “BoostNSift”. BoostNSift exploits the important information in queries by “boost”ing that information, and then “sift”s the identified code elements, based on a novel technique that emphasizes a code element's specific relatedness to a bug report over its generic relatedness to all bug reports. To evaluate the performance of BoostNSift, we employed a state-of-the-art empirical design that has been commonly used for evaluating file-level IR-based bug localization techniques: 6851 bugs are selected from the commonly used Eclipse, AspectJ, SWT, and ZXing benchmarks and made openly available for method-level analyses.
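The two ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the retrieval model, which query fields are boosted, or how specific and generic relatedness are combined. The sketch assumes raw term counts as the similarity measure, a hypothetical `boost` factor applied to bug-summary terms, and "sifting" realized as a method's score for the target report minus its mean score over all reports. All function names are illustrative.

```python
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer; a real system would split identifiers, stem, etc.
    return text.lower().split()

def boosted_query(summary, description, boost=2.0):
    # Assumption: summary terms carry the "important information" and get extra weight.
    weights = Counter()
    for term in tokenize(description):
        weights[term] += 1.0
    for term in tokenize(summary):
        weights[term] += boost
    return weights

def score(query_weights, method_tokens):
    # Weighted term overlap between a query and a method body.
    counts = Counter(method_tokens)
    return sum(w * counts[t] for t, w in query_weights.items())

def sift(methods, reports, target_index, boost=2.0):
    # Rank methods by specific relatedness: score for the target report
    # minus the method's average score across all reports (generic relatedness).
    queries = [boosted_query(s, d, boost) for s, d in reports]
    ranked = []
    for name, body in methods.items():
        tokens = tokenize(body)
        scores = [score(q, tokens) for q in queries]
        generic = sum(scores) / len(scores)
        ranked.append((name, scores[target_index] - generic))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```

Under these assumptions, a method that matches every bug report about equally (for example, one that mentions "error" often) is pushed down the ranking, while a method that matches mainly the target report rises.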