A Frequent Itemset Hiding Toolbox
Advances in data collection and storage technologies have led to the widespread adoption of transactional databases among companies and organizations, as they allow enormous amounts of data to be stored efficiently. Useful knowledge can be mined from these data and used in several ways, depending on the nature of the data. Companies and organizations are often willing to share data for mutual benefit. However, sharing such data carries risks, as privacy problems may arise. Sensitive data, along with sensitive knowledge inferred from these data, must be protected from unintentional exposure to unauthorized parties. One form of inferred knowledge is frequent patterns, mined as frequent itemsets from transactional databases. The problem of protecting such patterns is known as the frequent itemset hiding problem. In this paper we present a toolbox that provides several implementations of frequent itemset hiding algorithms. First, we summarize the most important aspects of each algorithm. We then introduce the architecture of the toolbox and its novel features. Finally, we provide experimental results on real-world datasets, demonstrating the efficiency of the toolbox and the convenience it offers in comparing different algorithms.