Statistical Efficiency of Travel Time Prediction
Modern mobile applications such as navigation services and ride-hailing platforms rely heavily on geospatial technologies, most critically predictions of the time required for a vehicle to traverse a particular route. Two major categories of prediction methods are segment-based approaches, which predict travel time at the level of road segments and then aggregate across the route, and route-based approaches, which use generic information about the trip, such as origin and destination, to predict travel time. Though various forms of these methods have been developed and used, there has been no rigorous theoretical comparison of the accuracy of these two approaches, and empirical studies have in many cases drawn opposite conclusions. We fill this gap by conducting the first theoretical analysis to compare these two approaches in terms of their predictive accuracy as a function of the sample size of the training data (the statistical efficiency). We introduce a modeling framework and formally define a family of segment-based estimators and route-based estimators that resemble many estimators proposed in the literature and used in practice. Under both finite-sample and asymptotic settings, we give conditions under which segment-based approaches dominate their route-based counterparts. We find that although route-based approaches can avoid the accumulated errors introduced by aggregating over individual road segments, this advantage is often offset by (significantly) smaller relevant sample sizes. For this reason we recommend the use of segment-based approaches if one has to make a choice between the two methods in practice. Our work highlights that the accuracy of travel time prediction is driven not just by the sophistication of the model, but also by the spatial granularity at which the model is applied.
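The sample-size argument can be illustrated with a small simulation. The sketch below is not the modeling framework of the paper; it is a toy setup under assumptions chosen only for illustration: a line network of six segments, trips that follow a uniformly random contiguous sub-path, per-segment travel times observed with independent Gaussian noise, and hypothetical names such as `simulate`, `n_trips`, and `sigma`. The route-based estimator averages the total time of training trips that took exactly the query route; the segment-based estimator averages each segment over every trip that traversed it and then sums.

```python
import random


def simulate(n_trips=200, n_segments=6, sigma=1.0, n_reps=300, seed=0):
    """Toy comparison of a route-based and a segment-based estimator.

    Illustrative assumptions (not taken from the paper): independent
    Gaussian noise per segment, routes drawn uniformly from all
    contiguous sub-paths of a line network.
    Returns (mse_route, mse_segment) for predicting the full-length route.
    """
    rng = random.Random(seed)
    true_mean = [rng.uniform(1.0, 5.0) for _ in range(n_segments)]
    query = tuple(range(n_segments))  # the route whose travel time we predict
    truth = sum(true_mean)

    # All contiguous sub-paths (a, a+1, ..., b) of the line network.
    routes = [
        tuple(range(a, b + 1))
        for a in range(n_segments)
        for b in range(a, n_segments)
    ]

    se_route = se_segment = 0.0
    used = 0
    for _ in range(n_reps):
        seg_obs = [[] for _ in range(n_segments)]  # observations per segment
        route_totals = []  # total times of trips that match the query route
        for _ in range(n_trips):
            route = rng.choice(routes)
            times = {s: true_mean[s] + rng.gauss(0.0, sigma) for s in route}
            for s, t in times.items():
                seg_obs[s].append(t)
            if route == query:
                route_totals.append(sum(times.values()))

        # Skip the (rare) replication where either estimator is undefined.
        if not route_totals or any(not obs for obs in seg_obs):
            continue

        route_est = sum(route_totals) / len(route_totals)
        segment_est = sum(sum(obs) / len(obs) for obs in seg_obs)
        se_route += (route_est - truth) ** 2
        se_segment += (segment_est - truth) ** 2
        used += 1

    return se_route / used, se_segment / used


mse_route, mse_segment = simulate()
print(f"route-based MSE:   {mse_route:.4f}")
print(f"segment-based MSE: {mse_segment:.4f}")
```

In this toy setting only about one training trip in 21 matches the query route exactly, while every segment is covered by a much larger share of trips, so the segment-based estimate averages over far more data. Because the noise here is independent and unbiased by construction, the sketch shows only the sample-size side of the trade-off and does not reproduce the aggregation errors that segment-based methods can incur in practice.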