Extracting Trips from Multi-Sourced Data for Mobility Pattern Analysis: An App-Based Data Example
Passively generated data, such as GPS data and cellular data, bring tremendous opportunities for human mobility analysis and transportation applications. Because such data are often collected for purposes unrelated to transportation, they must be processed to extract trips. Most existing trip extraction methods rely on data generated via a single positioning technology, such as GPS or triangulation through cellular towers (hereafter called single-sourced data); methods to extract trips from data generated via multiple positioning technologies (multi-sourced data) are absent. Yet multi-sourced data are increasingly common. Because they are generated using multiple technologies (e.g., GPS, cellular network-based, and WiFi-based positioning), multi-sourced data exhibit high variance in their temporal and spatial properties. In this study, we propose a 'Divide, Conquer and Integrate' (DCI) framework to extract trips from multi-sourced data. We evaluate the proposed framework by applying it to an app-based dataset, which is multi-sourced and has high variance in both location accuracy and observation interval (i.e., the time interval between two consecutive observations). On a manually labeled sample of the app-based data, the framework outperforms the state-of-the-art SVM model designed for GPS data. The effectiveness of the framework is further illustrated by the consistency between mobility patterns obtained from the app-based data and those from an externally collected household travel survey covering the same region and period.