Optimal Joins using Compact Data Structures

08/05/2019
by Gonzalo Navarro, et al.

Worst-case optimal join algorithms have gained a lot of attention in the database literature. We now have several different algorithms that have all been shown to be optimal in the worst case, and many of them have also been implemented and tested in practice. However, implementing these algorithms often requires an enhanced indexing structure: to achieve optimality we must either build completely new indexes or populate the database with several different instantiations of common indexes such as B+-trees. Either way, this costs extra storage space that may be non-negligible. In this paper we show that optimal algorithms can be obtained directly from a representation that regards the relations as point sets in variable-dimensional grids, without the need for extra storage. Our representation is a compact quadtree for the static indexes and a dynamic quadtree sharing subtrees (which we dub a Qdag) for intermediate results. We develop a compositional algorithm to process full join queries when data is stored in these structures, and then show that the running time of this algorithm is worst-case optimal in data complexity. Remarkably, we can even extend our framework to compute more expressive queries in relational algebra using both unions and a limited form of negation, by introducing a lazy version of Qdags. Once again, we can show that the running time of our algorithms is worst-case optimal.
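To make the grid view concrete, the following is a minimal sketch, not the paper's implementation. It assumes two binary relations R(A,B) and S(B,C) over the domain [0, 2^k), stores each as a plain pointer-based quadtree, and computes R ⋈ S by descending the 2^k x 2^k x 2^k grid while mapping each of the eight subcells to the matching child of each quadtree. The names `build` and `join` and the nested-list node layout are illustrative assumptions; the compact encoding, the subtree sharing of Qdags, and the optimality analysis described above are not reproduced here.

```python
def build(points, k):
    """Quadtree over a 2^k x 2^k grid.

    None = empty region, True = a single occupied cell,
    otherwise a list of 4 children indexed by (high bit of x, high bit of y).
    """
    if not points:
        return None
    if k == 0:
        return True
    half = 1 << (k - 1)
    quads = [[], [], [], []]
    for x, y in points:
        # Route each point to its quadrant and keep only the low-order bits.
        quads[((x >= half) << 1) | (y >= half)].append((x % half, y % half))
    return [build(q, k - 1) for q in quads]


def join(r, s, k, a=0, b=0, c=0):
    """Enumerate (a, b, c) with (a, b) in R and (b, c) in S.

    r and s are quadtrees for R(A,B) and S(B,C). Each 3-d subcell
    (bit_a, bit_b, bit_c) is mapped to child (bit_a, bit_b) of r and
    child (bit_b, bit_c) of s, so neither tree is materialized in 3-d.
    """
    if r is None or s is None:
        return []  # an empty region in either relation prunes the whole subcell
    if k == 0:
        return [(a, b, c)]
    half = 1 << (k - 1)
    out = []
    for bits in range(8):
        ba, bb, bc = (bits >> 2) & 1, (bits >> 1) & 1, bits & 1
        out += join(r[(ba << 1) | bb], s[(bb << 1) | bc], k - 1,
                    a + ba * half, b + bb * half, c + bc * half)
    return out


# Hypothetical example data over the domain {0, 1, 2, 3} (k = 2).
R = {(0, 1), (2, 3), (1, 1)}
S = {(1, 2), (3, 0), (1, 0)}
result = sorted(join(build(R, 2), build(S, 2), 2))
```

The pruning step is what the grid view buys: a subcell is abandoned as soon as either relation is empty in its projection, without consulting any additional index.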
