LaDe: The First Comprehensive Last-mile Delivery Dataset from Industry

by Lixia Wu, et al.

Real-world last-mile delivery datasets are crucial for research in logistics, supply chain management, and spatio-temporal data mining. Despite a plethora of algorithms developed to date, no widely accepted, publicly available last-mile delivery dataset exists to support research in this field. In this paper, we introduce LaDe, the first publicly available last-mile delivery dataset with millions of packages from industry. LaDe has three unique characteristics: (1) Large-scale. It involves 10,677k packages of 21k couriers over 6 months of real-world operation. (2) Comprehensive information. It offers original package information, such as location and time requirements, as well as task-event information, which records when and where the courier is when events such as task-accept and task-finish happen. (3) Diversity. The dataset includes data from various scenarios, including package pick-up and delivery, and from multiple cities, each with unique spatio-temporal patterns due to distinct characteristics such as population. We verify LaDe on three tasks by running several classical baseline models per task. We believe that the large-scale, comprehensive, and diverse features of LaDe can offer unparalleled opportunities to researchers in the supply chain community, data mining community, and beyond. The dataset homepage is publicly available at
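To make the "comprehensive information" characteristic concrete, the following is a minimal sketch of how one package record with its task events might be represented. All class names, field names, and values here are hypothetical illustrations, not the dataset's actual schema; only the split into package information (location, time requirements) and task-event information (when and where the courier is at task-accept and task-finish) comes from the abstract.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class TaskEvent:
    """Courier time and position when an event happens (hypothetical fields)."""
    kind: str        # e.g. "task-accept" or "task-finish"
    time: datetime
    lng: float
    lat: float


@dataclass(frozen=True)
class PackageRecord:
    """Package information plus its task events (hypothetical fields)."""
    package_id: str
    courier_id: str
    city: str
    scenario: str              # "pick-up" or "delivery"
    lng: float                 # package location
    lat: float
    window_start: datetime     # time requirement: earliest allowed time
    window_end: datetime       # time requirement: latest allowed time
    events: Tuple[TaskEvent, ...]

    def event_time(self, kind: str) -> datetime:
        """Return the timestamp of the first event of the given kind."""
        for event in self.events:
            if event.kind == kind:
                return event.time
        raise KeyError(kind)

    def finished_within_window(self) -> bool:
        """Check whether task-finish falls inside the required time window."""
        finish = self.event_time("task-finish")
        return self.window_start <= finish <= self.window_end


def handling_minutes(record: PackageRecord) -> float:
    """Minutes elapsed between the task-accept and task-finish events."""
    delta = record.event_time("task-finish") - record.event_time("task-accept")
    return delta.total_seconds() / 60.0


# Made-up example record, for illustration only.
example = PackageRecord(
    package_id="p-001",
    courier_id="c-042",
    city="city-A",
    scenario="delivery",
    lng=121.47,
    lat=31.23,
    window_start=datetime(2022, 6, 1, 9, 0),
    window_end=datetime(2022, 6, 1, 10, 0),
    events=(
        TaskEvent("task-accept", datetime(2022, 6, 1, 9, 0), 121.46, 31.22),
        TaskEvent("task-finish", datetime(2022, 6, 1, 9, 30), 121.47, 31.23),
    ),
)
```

Keeping events as a separate list attached to each package, rather than as fixed columns, is one way to accommodate both pick-up and delivery scenarios with the same record type.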

