# Beating the probabilistic lower bound on perfect hashing

For an integer $q\ge 2$, a perfect $q$-hash code $C$ is a block code over $\mathbb{Z}_q := \mathbb{Z}/q\mathbb{Z}$ of length $n$ in which every subset $\{c_1, c_2, \dots, c_q\}$ of $q$ codewords is separated, i.e., there exists $i\in[n]$ such that $\{\mathrm{proj}_i(c_1), \mathrm{proj}_i(c_2), \dots, \mathrm{proj}_i(c_q)\} = \mathbb{Z}_q$, where $\mathrm{proj}_i(c_j)$ denotes the $i$th coordinate of $c_j$. Finding the maximum size $M(n,q)$ of a perfect $q$-hash code of length $n$, for given $q$ and $n$, is a fundamental problem in combinatorics, information theory, and computer science. In this paper, we are interested in the asymptotic behavior of this problem. More precisely, we focus on the quantity $R_q := \limsup_{n\rightarrow\infty} \frac{\log_2 M(n,q)}{n}$. A well-known probabilistic argument gives an existence lower bound on $R_q$, namely $R_q\ge\frac{1}{q-1}\log_2\left(\frac{1}{1-q!/q^q}\right)$ [8,10]. This is still the best-known lower bound, except for the case $q=3$, for which K\"{o}rner and Marton [11] found that the concatenation technique could lead to perfect $3$-hash codes beating the probabilistic lower bound. The improvement on the lower bound on $R_3$ was discovered in 1988, and there has been no progress on lower bounds on $R_q$ for more than 30 years, despite some work on upper bounds on $R_q$. In this paper, we show that the probabilistic lower bound can be improved for $q=4,8$, for all odd integers between $3$ and $25$, and for \emph{every sufficiently large odd} $q$. Our idea is based on a modified concatenation, which differs from the classical concatenation in which both the inner and outer codes are separated: in our concatenation, we do not require the inner code to be a perfect $q$-hash code. This gives a more flexible choice of inner codes, and hence we are able to beat the probabilistic existence lower bound on $R_q$.
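To make the definition and the probabilistic bound concrete, here is a minimal Python sketch. It is not the construction of the paper: it only checks the separation property by brute force and evaluates the formula $\frac{1}{q-1}\log_2\left(\frac{1}{1-q!/q^q}\right)$ quoted in the abstract. The function names `is_perfect_hash_code` and `probabilistic_lower_bound` are hypothetical, and codewords are assumed to be tuples of integers in $\{0, \dots, q-1\}$.

```python
from itertools import combinations
from math import factorial, log2


def is_perfect_hash_code(code, q):
    """Brute-force check that every q-subset of codewords is separated.

    `code` is a list of equal-length tuples over {0, ..., q-1} (an assumed
    representation). A subset is separated if some coordinate takes all q
    symbols on it. Fewer than q codewords is vacuously perfect.
    """
    if not code:
        return True
    n = len(code[0])
    for subset in combinations(code, q):
        separated = any(len({c[i] for c in subset}) == q for i in range(n))
        if not separated:
            return False
    return True


def probabilistic_lower_bound(q):
    """Evaluate (1 / (q - 1)) * log2(1 / (1 - q! / q^q))."""
    p = factorial(q) / q**q  # probability that q random symbols are all distinct
    return log2(1.0 / (1.0 - p)) / (q - 1)


if __name__ == "__main__":
    good = [(0,), (1,), (2,)]                 # separated at the only coordinate
    bad = [(0, 0), (1, 1), (2, 2), (0, 1)]    # {00, 11, 01} is never separated
    print(is_perfect_hash_code(good, 3))
    print(is_perfect_hash_code(bad, 3))
    for q in (3, 4, 5):
        print(q, probabilistic_lower_bound(q))
```

The check enumerates all $\binom{|C|}{q}$ subsets, so it is only practical for very small codes; it is meant as an executable restatement of the definition rather than a search tool.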
