GEO-BLEU: Similarity Measure for Geospatial Sequences

12/14/2021
by Toru Shimizu, et al.

In recent geospatial research, modeling large-scale human mobility data via self-supervised learning is growing in importance, in parallel with the progress in natural language processing driven by self-supervised approaches using large-scale corpora. Although plenty of feasible approaches already exist for geospatial sequence modeling itself, there is room for improvement in evaluation, specifically in how to measure the similarity between generated and reference sequences. In this work, we propose a novel similarity measure, GEO-BLEU, which can be especially useful in the context of geospatial sequence modeling and generation. As the name suggests, it is based on BLEU, one of the most popular measures used in machine translation research, and introduces spatial proximity into the idea of n-grams. We compare this measure with an established baseline, dynamic time warping, by applying both to actual generated geospatial sequences. Using crowdsourced annotations on the similarity between geospatial sequences, collected for over 12,000 cases, we show the proposed method's superiority both quantitatively and qualitatively.
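To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. `dtw_distance` is the textbook dynamic time warping recurrence used as the baseline. `ngram_proximity` is a hypothetical illustration of how spatial proximity could soften exact n-gram matching. The exponential decay, the `beta` parameter, and the representation of locations as `(x, y)` grid cells are all assumptions, since the abstract does not give the actual formula.

```python
import math


def euclid(p, q):
    """Euclidean distance between two (x, y) points (assumed grid coordinates)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def dtw_distance(a, b, dist=euclid):
    """Classic dynamic time warping distance between two point sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def ngram_proximity(g, r, beta=0.5, dist=euclid):
    """Hypothetical soft n-gram match (an assumption, not the paper's definition).

    Returns 1.0 when the two equal-length n-grams coincide and decays
    toward 0 as the summed point-wise distance grows.
    """
    return math.exp(-beta * sum(dist(p, q) for p, q in zip(g, r)))


generated = [(0, 0), (1, 0), (2, 0)]
reference = [(0, 0), (2, 0)]
print(dtw_distance(generated, reference))
print(ngram_proximity([(0, 0), (1, 0)], [(0, 0), (1, 1)]))
```

Under this sketch, an exact-match measure such as plain BLEU would score the bigram `[(0, 0), (1, 0)]` against `[(0, 0), (1, 1)]` as a miss, whereas a proximity-weighted match gives partial credit for the nearby cell.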

research
07/19/2021

Integrating Unsupervised Data Generation into Self-Supervised Neural Machine Translation for Low-Resource Languages

For most language combinations, parallel data is either scarce or simply...
research
05/16/2023

Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation Based Voice Conversion

Nowadays, recognition-synthesis-based methods have been quite popular wi...
research
05/26/2023

ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation

Paraphrase generation is a long-standing task in natural language proces...
research
03/22/2016

Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus

Over the past decade, large-scale supervised learning corpora have enabl...
research
05/17/2023

Self-Supervised Learning for Physiologically-Based Pharmacokinetic Modeling in Dynamic PET

Dynamic positron emission tomography imaging (dPET) provides temporally ...
research
07/31/2023

SelfSeg: A Self-supervised Sub-word Segmentation Method for Neural Machine Translation

Sub-word segmentation is an essential pre-processing step for Neural Mac...
research
01/15/2020

Learning similarity measures from data

Defining similarity measures is a requirement for some machine learning ...