ChainedFilter: Combining Membership Filters by Chain Rule
Membership (membership query / membership testing) is a fundamental problem across databases, networks and security. However, previous research has primarily focused on either approximate solutions, such as Bloom Filters, or exact methods, like perfect hashing and dictionaries, without attempting to develop a an integral theory. In this paper, we propose a unified and complete theory, namely chain rule, for general membership problems, which encompasses both approximate and exact membership as extreme cases. Building upon the chain rule, we introduce a straightforward yet versatile algorithm framework, namely ChainedFilter, to combine different elementary filters without losing information. Our evaluation results demonstrate that ChainedFilter performs well in many applications: (1) it requires only 26 theoretical lower bound for implicit static dictionary, (2) it requires only 0.22 additional bit per item over the theoretical lower bound for lossless data compression, (3) it reduces up to 31 Hashing, (4) it reduces up to 36 Filter under the same space cost in RocksDB database, and (5) it reduces up to 99.1
READ FULL TEXT