BenchPress: Analyzing Android App Vulnerability Benchmark Suites

by Joydeep Mitra, et al.

In recent years, various efforts have designed and developed benchmark suites to evaluate the efficacy of tools that detect vulnerabilities in Android apps. The choice of benchmark suites used in tool evaluations is often based on the availability and popularity of suites rather than on their characteristics and relevance to real-world native Android apps. One reason for this is the lack of information about the characteristics and relevance of benchmark suites relative to real-world apps. In this paper, we report the findings from our effort to address this gap. We empirically evaluated three Android-specific benchmark suites: DroidBench, Ghera, and IccBench. For each suite, we report how well it represents real-world apps in terms of API usage: 1) coverage: how often are the APIs used in a benchmark suite used in a sample of real-world native Android apps? and 2) gap: which of the APIs used in a sample of real-world native Android apps are not used in any benchmark suite? Based on pairwise comparison, we also report how these suites fare relative to each other in terms of API usage. The findings in this paper can help 1) developers of Android security analysis tools choose the benchmark suites best suited to evaluate their tools (informed by coverage and pairwise comparison) and 2) creators of Android-specific benchmarks improve the API-usage-based representativeness of their suites (informed by gaps).
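
The coverage and gap questions above can be illustrated with a minimal sketch. It assumes each benchmark suite and each sampled app has already been reduced to the set of APIs it uses. The function names, the per-API "fraction of apps" reading of coverage, and the placeholder API and app names are assumptions made for illustration, not the paper's actual tooling or data.

```python
from typing import Dict, Set


def coverage(suite_apis: Set[str], app_apis: Dict[str, Set[str]]) -> Dict[str, float]:
    """For each API used by a suite, the fraction of sampled apps that also use it."""
    n_apps = len(app_apis)
    if n_apps == 0:
        return {api: 0.0 for api in suite_apis}
    return {
        api: sum(api in used for used in app_apis.values()) / n_apps
        for api in suite_apis
    }


def gap(suites: Dict[str, Set[str]], app_apis: Dict[str, Set[str]]) -> Set[str]:
    """APIs used by at least one sampled app but by no benchmark suite."""
    used_by_apps = set().union(*app_apis.values())
    used_by_suites = set().union(*suites.values())
    return used_by_apps - used_by_suites


def only_in_first(first: Set[str], second: Set[str]) -> Set[str]:
    """Pairwise comparison: APIs used by the first suite but not the second."""
    return first - second


# Made-up data: placeholder API names, hypothetical suites and apps.
apps = {
    "app1": {"api_a", "api_b", "api_c"},
    "app2": {"api_a", "api_c"},
    "app3": {"api_c", "api_d"},
    "app4": {"api_a"},
}
suites = {
    "suite_x": {"api_a", "api_b"},
    "suite_y": {"api_a", "api_e"},
}

cov_x = coverage(suites["suite_x"], apps)
missing = gap(suites, apps)
x_not_y = only_in_first(suites["suite_x"], suites["suite_y"])
```

In this toy data, a suite API that no sampled app uses (such as `api_e`) gets a coverage of zero, while an API that many apps use but no suite exercises (such as `api_c`) shows up in the gap.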


