Hash Adaptive Bloom Filter

06/13/2021
by   Rongbiao Xie, et al.

A Bloom filter is a compact, memory-efficient probabilistic data structure that supports membership testing, i.e., checking whether an element is in a given set. However, because a Bloom filter maps each element with uniformly random hash functions, it offers little flexibility even when information about negative keys (elements that are not in the set) is available. The problem gets worse when misidentifying different negative keys incurs different costs. To address these problems, we propose a new Hash Adaptive Bloom Filter (HABF) that supports the customization of hash functions for keys. The key idea of HABF is to customize the hash functions for positive keys (elements that are in the set) so as to avoid high-cost negative keys, and to pack the customized hash functions into a lightweight data structure named HashExpressor. Then, given an element at query time, HABF follows a two-round pattern to check whether the element is in the set. Further, we theoretically analyze the performance of HABF and bound its expected false positive rate. We conduct extensive experiments on representative datasets, and the results show that HABF on the whole outperforms the standard Bloom filter and its cutting-edge variants in terms of accuracy, construction time, query time, and memory consumption (source code is available in [1]).
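To make the idea of per-key hash customization concrete, the following is a minimal toy sketch in Python. It is not the authors' algorithm: the greedy selection rule, the constants `M`, `POOL`, `K`, and `DEFAULT`, and all function names are assumptions made for illustration, and a plain dictionary stands in for HashExpressor (which, per the abstract, is a compact structure rather than an exact map). The sketch only shows how choosing hash functions for positive keys can keep known high-cost negative keys from becoming false positives, and how a two-round query can preserve the no-false-negative property.

```python
import hashlib
from itertools import combinations

M = 64            # number of bits in the filter (assumed for illustration)
POOL = 4          # size of the hash-function pool (assumed)
K = 2             # hash functions used per key (assumed)
DEFAULT = (0, 1)  # hash functions used when a key has no customization


def position(key, i):
    """Bit position of `key` under the i-th hash function of the pool."""
    digest = hashlib.blake2b(key.encode(), digest_size=8, salt=bytes([i])).digest()
    return int.from_bytes(digest, "big") % M


def positions(key, selection):
    return {position(key, i) for i in selection}


def build(positives, negative_costs):
    """Greedy toy construction. `negative_costs` maps known negative keys to costs."""
    bits = set()    # indices of bits set to 1
    overrides = {}  # exact map standing in for HashExpressor (hypothetical)
    neg_pos = {n: positions(n, DEFAULT) for n in negative_costs}

    def cost(candidate_bits):
        # Total cost of known negatives that would be misidentified.
        return sum(c for n, c in negative_costs.items() if neg_pos[n] <= candidate_bits)

    for key in positives:
        # Try every K-subset of the pool; DEFAULT comes first, so it wins ties.
        best = min(combinations(range(POOL), K),
                   key=lambda sel: cost(bits | positions(key, sel)))
        bits |= positions(key, best)
        if best != DEFAULT:
            overrides[key] = best
    return bits, overrides


def query(key, bits, overrides):
    # Round 1: check the default hash functions.
    if positions(key, DEFAULT) <= bits:
        return True
    # Round 2: check the customized hash functions, if any were stored.
    sel = overrides.get(key)
    return sel is not None and positions(key, sel) <= bits


positives = [f"user{i}" for i in range(12)]
negative_costs = {f"bot{i}": float(i + 1) for i in range(30)}
bits, overrides = build(positives, negative_costs)
weighted_fp = sum(c for n, c in negative_costs.items() if query(n, bits, overrides))
print("customized keys:", len(overrides), "weighted false-positive cost:", weighted_fp)
```

Every positive key is inserted under either the default or its stored selection, so one of the two query rounds always succeeds for it. Because this toy stores overrides exactly, a negative key is only ever tested with the default hash functions; a compact, approximate structure in place of the dictionary would change that behavior.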


research
03/17/2019

Shed More Light on Bloom Filter's Variants

Bloom Filter is a probabilistic membership data structure and it is exce...
research
06/11/2023

Time-limited Bloom Filter

A Bloom Filter is a probabilistic data structure designed to check, rapi...
research
06/06/2021

robustBF: A High Accuracy and Memory Efficient 2D Bloom Filter

Bloom Filter is an important probabilistic data structure to reduce memo...
research
06/06/2021

countBF: A General-purpose High Accuracy and Space Efficient Counting Bloom Filter

Bloom Filter is a probabilistic data structure for the membership query,...
research
01/07/2019

Bloom Multifilters for Multiple Set Matching

Bloom filter is a space-efficient probabilistic data structure for check...
research
01/07/2019

Multiple Set Matching and Pre-Filtering with Bloom Multifilters

Bloom filter is a space-efficient probabilistic data structure for check...
research
02/08/2020

The Bloom Tree

The Bloom tree is a probabilistic data structure that combines the idea ...
