Supervised Tree-Wasserstein Distance

01/27/2021
by Yuki Takezawa, et al.

The Wasserstein distance is a powerful tool for measuring the similarity of documents, but it is computationally expensive. Recently, methods that approximate the Wasserstein distance using a tree metric have been proposed to speed up its computation. These tree-based methods allow fast comparisons of a large number of documents; however, they are unsupervised and do not learn task-specific distances. In this work, we propose the Supervised Tree-Wasserstein (STW) distance, a fast, supervised metric learning method based on the tree metric. Specifically, we rewrite the Wasserstein distance on the tree metric in terms of the parent-child relationships of a tree, and formulate learning the distance as a continuous optimization problem using a contrastive loss. Experimentally, we show that the STW distance can be computed quickly and improves accuracy on document classification tasks. Furthermore, because the STW distance is formulated with matrix multiplications, it runs on a GPU and is suitable for batch processing. We therefore show that the STW distance is extremely efficient when comparing a large number of documents.
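
To make the matrix-multiplication formulation concrete, the following is a minimal NumPy sketch of the standard closed form of the Wasserstein distance on a fixed tree, written in terms of a parent-child matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `parent` array encoding (with -1 marking the root), and the toy tree are hypothetical, and the supervised part of STW (learning the tree with a contrastive loss) is not shown.

```python
import numpy as np

def subtree_matrix(parent):
    """Return A with A[i, j] = 1 if node j lies in the subtree rooted at node i.

    `parent[j]` is the index of node j's parent, or -1 for the root
    (an encoding assumed for this sketch).
    """
    n = len(parent)
    D = np.zeros((n, n))
    for child, par in enumerate(parent):
        if par >= 0:
            D[par, child] = 1.0  # D[i, j] = 1 iff i is the parent of j
    # For a tree, D is nilpotent, so (I - D)^{-1} = I + D + D^2 + ...
    # and entry (i, j) is 1 exactly when i is j itself or an ancestor of j.
    return np.linalg.inv(np.eye(n) - D)

def tree_wasserstein(parent, edge_weight, mu, nu):
    """Wasserstein distance on a tree metric.

    edge_weight[v] is the length of the edge from node v to its parent
    (0 for the root). mu and nu hold probability mass per node, with
    shape (n,) for one pair or (n, B) for a batch of B pairs.
    """
    A = subtree_matrix(parent)
    # Mass difference inside each subtree, weighted by the edge above it.
    return edge_weight @ np.abs(A @ (mu - nu))

# Toy tree: node 0 is the root, nodes 1 and 2 are its children,
# and node 3 is a child of node 1. All edges have length 1.
parent = [-1, 0, 0, 1]
edge_weight = np.array([0.0, 1.0, 1.0, 1.0])
mu = np.array([0.0, 0.0, 0.0, 1.0])  # all mass on node 3
nu = np.array([0.0, 0.0, 1.0, 0.0])  # all mass on node 2

# The tree path 3 -> 1 -> 0 -> 2 crosses three unit-length edges.
print(tree_wasserstein(parent, edge_weight, mu, nu))
```

Because the whole computation is one matrix product followed by a weighted sum, stacking many documents as columns of `mu` and `nu` evaluates all pairs in a single call, which is the property that makes this kind of formulation a natural fit for GPU batch processing.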

