Cost-aware Vulnerability Prediction: the HARMLESS Approach
Society needs more secure software, but predicting vulnerabilities is difficult, and existing methods are not applied in practice because of various limitations. The goal of this paper is to design a cost-aware vulnerability prediction method that balances the percentage of vulnerabilities found against the cost of the human effort spent on security review and testing. To this end, this paper presents HARMLESS, an incremental vulnerability prediction tool. HARMLESS is an active learner that (a) builds a support vector machine on the source code files reviewed to date, then (b) suggests which other source code files might contain vulnerabilities and should be reviewed next. A unique feature of HARMLESS is that it can estimate the number of remaining vulnerabilities. To the best of our knowledge, HARMLESS is the first tool to provide such an estimate in the arena of vulnerability prediction. Using that estimator, HARMLESS can guide security review and testing to any desired level of recall, i.e., the percentage of vulnerabilities found. In experiments on a case study of the Mozilla Firefox project, HARMLESS found 90, 95, 99 source code files, respectively.
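The loop in steps (a) and (b) can be illustrated with a minimal sketch. This is not the HARMLESS implementation: the function names (`train_linear_svm`, `review_until_found`), the two-dimensional toy features, and the Pegasos-style linear SVM trainer are assumptions made to keep the example dependency-free. The stopping rule (`target_found`) is also a hypothetical stand-in for the remaining-vulnerability estimator, which the abstract mentions but does not describe.

```python
import random

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Pegasos-style subgradient descent for a linear SVM (labels are +1/-1).
    An illustrative substitute for whatever SVM trainer the tool actually uses."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1 - eta * lam) for wj in w]  # regularization shrink
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def review_until_found(features, oracle, target_found, seed=0):
    """Active-learning loop: (a) train on reviewed files, (b) review the
    unreviewed file the model scores as most likely vulnerable.
    `oracle(i)` stands for a human reviewer returning +1 or -1 for file i."""
    rng = random.Random(seed)
    unreviewed = list(range(len(features)))
    rng.shuffle(unreviewed)
    labels, found = {}, 0
    while unreviewed and found < target_found:
        if len(set(labels.values())) < 2:
            nxt = unreviewed[0]  # sample randomly until both classes are seen
        else:
            idx = list(labels)
            w, b = train_linear_svm([features[i] for i in idx],
                                    [labels[i] for i in idx])
            nxt = max(unreviewed, key=lambda i: sum(
                wj * xj for wj, xj in zip(w, features[i])) + b)
        unreviewed.remove(nxt)
        labels[nxt] = oracle(nxt)
        found += labels[nxt] == 1
    return labels

# Toy data (assumed): 6 vulnerable files with larger feature values, 24 clean.
rng = random.Random(1)
truth = [1] * 6 + [-1] * 24
features = [[rng.random() + (2 if t == 1 else 0),
             rng.random() + (2 if t == 1 else 0)] for t in truth]
reviewed = review_until_found(features, lambda i: truth[i], target_found=5)
found = sum(1 for v in reviewed.values() if v == 1)
print(f"reviewed {len(reviewed)} of {len(features)} files, found {found} vulnerable")
```

In this sketch the reviewer stops once a fixed number of vulnerable files has been found. A cost-aware tool would instead stop when the number found, divided by an estimate of the total, reaches the desired recall.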