Gromov-Wasserstein Distances: Entropic Regularization, Duality, and Sample Complexity
The Gromov-Wasserstein (GW) distance quantifies dissimilarity between metric measure spaces and provides a meaningful figure of merit for applications involving heterogeneous data. While computational aspects of the GW distance have been widely studied, a strong duality theory and fundamental statistical questions concerning empirical convergence rates remained obscure. This work closes these gaps for the (2,2)-GW distance (namely, with quadratic cost) over Euclidean spaces of different dimensions d_x and d_y. We consider both the standard GW and the entropic GW (EGW) distances, derive their dual forms, and use them to analyze expected empirical convergence rates. The resulting rates are n^-2/max{d_x,d_y,4} (up to a log factor when max{d_x,d_y}=4) and n^-1/2 for the two-sample GW and EGW problems, respectively, which matches the corresponding rates for standard and entropic optimal transport distances. We also study stability of EGW in the entropic regularization parameter and establish approximation and continuity results for the cost and optimal couplings. Lastly, the duality is leveraged to shed new light on the open problem of the one-dimensional GW distance between uniform distributions on n points, illuminating why the identity and anti-identity permutations may not be optimal. Our results serve as a first step towards a comprehensive statistical theory as well as computational advancements for GW distances, based on the discovered dual formulation.
READ FULL TEXT