Database Matching Under Column Repetitions

02/03/2022
by Serhat Bakirtas, et al.

Motivated by synchronization errors in the sampling of time-indexed databases, matching of random databases under random column repetitions (including deletions) is investigated. Column histograms are used as a permutation-invariant feature to detect the repetition pattern, whose asymptotic uniqueness is proved using information-theoretic tools. Repetition detection is followed by a row matching scheme. For this overall scheme, sufficient conditions for successful database matching are derived in terms of the database growth rate. A modified version of Fano's inequality leads to a tight necessary condition for successful matching, establishing the matching capacity under column repetitions. This capacity is equal to the erasure bound, which assumes the repetition locations are known a priori.
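The histogram-based detection step can be illustrated with a small sketch. This is not the authors' algorithm as specified in the full text; it is a minimal illustration under stated assumptions: entries are noiseless, repetitions preserve column order, and every column of the original database has a distinct histogram (the property the abstract says holds asymptotically). The function names `column_histogram`, `apply_repetitions`, and `detect_pattern`, as well as the toy database, are hypothetical.

```python
import random
from collections import Counter


def column_histogram(db, j):
    """Histogram of column j as a sorted tuple; unchanged by row shuffling."""
    return tuple(sorted(Counter(row[j] for row in db).items()))


def apply_repetitions(db, pattern):
    """Repeat column j of db pattern[j] times; a count of 0 deletes it."""
    return [[row[j] for j, s in enumerate(pattern) for _ in range(s)]
            for row in db]


def detect_pattern(db_x, db_y):
    """Recover the repetition pattern by scanning histograms left to right.

    Assumption (hypothetical setup): histograms of distinct columns of
    db_x are all different, so each run of equal histograms in db_y
    identifies exactly one column of db_x.
    """
    hist_y = [column_histogram(db_y, i) for i in range(len(db_y[0]))]
    pattern, pos = [], 0
    for j in range(len(db_x[0])):
        h = column_histogram(db_x, j)
        count = 0
        while pos < len(hist_y) and hist_y[pos] == h:
            count += 1
            pos += 1
        pattern.append(count)
    return pattern


# Toy database: 8 rows, 5 binary columns; column j holds j + 1 ones,
# so all column histograms are distinct by construction.
db_x = [[1 if i <= j else 0 for j in range(5)] for i in range(8)]

true_pattern = [1, 0, 2, 1, 3]          # column 1 deleted, others repeated
db_y = apply_repetitions(db_x, true_pattern)
random.Random(0).shuffle(db_y)          # unknown row permutation

detected = detect_pattern(db_x, db_y)
```

Because the histogram of a column does not depend on row order, the detection works on the shuffled database without knowing the permutation. Once the pattern is known, the deleted columns can be dropped and the replicas collapsed before row matching is attempted.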
