EmbAssi: Embedding Assignment Costs for Similarity Search in Large Graph Databases

11/15/2021
by Franka Bause, et al.

The graph edit distance is an intuitive measure to quantify the dissimilarity of graphs, but its computation is NP-hard and challenging in practice. We introduce methods for efficiently answering nearest neighbor and range queries with respect to this distance in large databases with up to millions of graphs. We build on the filter-verification paradigm, where lower and upper bounds are used to reduce the number of exact computations of the graph edit distance. Highly effective bounds for this purpose involve solving a linear assignment problem for each graph in the database, which is prohibitive for massive datasets. Index-based approaches typically provide only weak bounds, leading to high computational costs for verification. In this work, we derive novel lower bounds for efficient filtering from restricted assignment problems, where the cost function is a tree metric. This special case allows embedding the costs of optimal assignments isometrically into ℓ_1 space, rendering efficient indexing possible. We propose several lower bounds of the graph edit distance obtained from tree metrics reflecting the edit costs, which are combined for effective filtering. Our method, termed EmbAssi, can be integrated into existing filter-verification pipelines as a fast and effective pre-filtering step. Empirically, we show that for many real-world graphs our lower bounds are already close to the exact graph edit distance, while our index construction and search scale to very large databases.
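
The embedding idea in the abstract can be illustrated on a toy example. The sketch below is not the authors' implementation; it assumes a small hypothetical tree (`PARENT`), two equal-sized multisets of tree nodes, and the standard property of tree metrics that the optimal assignment cost equals the sum, over all edges, of the edge weight times the difference in how many elements of each multiset lie below that edge. All names here are made up for illustration.

```python
from itertools import permutations

# Hypothetical toy tree: child -> (parent, edge weight); the root has no entry.
PARENT = {
    "a": ("root", 1.0),
    "b": ("root", 1.0),
    "c": ("a", 0.5),
    "d": ("a", 0.5),
}


def path_to_root(node):
    """Edges (identified by their child endpoint) on the path from node to the root."""
    edges = []
    while node in PARENT:
        edges.append(node)
        node = PARENT[node][0]
    return edges


def tree_distance(u, v):
    """Length of the unique path between u and v: edges on exactly one root path."""
    eu, ev = set(path_to_root(u)), set(path_to_root(v))
    return sum(PARENT[e][1] for e in eu ^ ev)


def embed(multiset):
    """One coordinate per edge: edge weight times the number of elements below it."""
    counts = {e: 0 for e in PARENT}
    for node in multiset:
        for e in path_to_root(node):
            counts[e] += 1
    return [PARENT[e][1] * counts[e] for e in sorted(PARENT)]


def l1(x, y):
    """Manhattan distance between two embedding vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def brute_force_assignment(A, B):
    """Optimal assignment cost by trying every bijection (toy sizes only)."""
    return min(
        sum(tree_distance(a, b) for a, b in zip(A, p)) for p in permutations(B)
    )


if __name__ == "__main__":
    A = ["c", "c", "b"]
    B = ["d", "a", "root"]
    print("l1 distance of embeddings:", l1(embed(A), embed(B)))
    print("brute-force assignment cost:", brute_force_assignment(A, B))
```

Because the assignment cost becomes a plain ℓ_1 distance between fixed vectors, the vectors can be computed once per database graph and searched with a standard index, which is the property the abstract relies on for pre-filtering.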


research
10/04/2021

Metric Indexing for Graph Similarity Search

Finding the graphs that are most similar to a query graph in a large dat...
research
09/20/2017

Efficient Graph Edit Distance Computation and Verification via Anchor-aware Lower Bound Estimation

Graph edit distance (GED) is an important similarity measure adopted in ...
research
04/18/2019

Convex Graph Invariant Relaxations For Graph Edit Distance

The edit distance between two graphs is a widely used measure of similar...
research
06/17/2017

An Efficient Probabilistic Approach for Graph Similarity Search

Graph similarity search is a common and fundamental operation in graph d...
research
08/01/2019

New Techniques for Graph Edit Distance Computation

Due to their capacity to encode rich structural information, labeled gra...
research
06/29/2019

Upper Bounding GED via Transformations to LSAPE Based on Rings and Machine Learning

The graph edit distance (GED) is a flexible distance measure which is wi...
research
02/16/2018

Recognizing Cuneiform Signs Using Graph Based Methods

The cuneiform script constitutes one of the earliest systems of writing ...
