Passenger Path Choice Estimation Using Smart Card Data: A Latent Class Approach with Panel Effects Across Days
Understanding passengers' path choice behavior in urban rail systems is a prerequisite for effective operations and planning. This paper proposes a probabilistic approach to infer passengers' path choice behavior in urban rail systems from large-scale smart card data. The model uses latent classes and panel effects to capture passengers' implicit behavioral heterogeneity and longitudinal correlations, two key research gaps in big-data-driven behavior studies. We formulate the probability of each individual's arrival time at a destination based on their path choice behavior, and estimate the corresponding path choice model parameters by maximum likelihood. The original likelihood function is intractable because its computational complexity is exponential. We derive a tractable likelihood function and propose a numerical integration approach to estimate the model efficiently. We also propose a method to calculate the t-statistics of the estimated choice parameters based on the numerically estimated Hessian matrix and the Cramér-Rao bound (the lower bound on the variance of the coefficient estimates). Case studies using synthetic data validate the model's performance and its robustness against parameter initialization and input errors, and highlight the importance of incorporating the impact of crowding in path choice estimation. Applications using actual data from the Mass Transit Railway, Hong Kong, reveal two latent groups of passengers: time-sensitive (TS) and comfort-aware (CA). TS passengers are more likely to choose paths with short travel times. Most of them are regular commuters with high travel frequency and less schedule flexibility. CA passengers care more about travel comfort and choose paths with less walking and waiting time. The proposed approach is data-driven and general enough to accommodate other discrete choice structures.
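The t-statistic calculation mentioned above can be illustrated with a minimal sketch. It is not the paper's model: a toy binary logit stands in for the path choice likelihood, and all names (`neg_log_likelihood`, `numerical_hessian`, `fit_logit`, the synthetic data) are assumptions made for illustration. The idea shown is only the general one: approximate the Hessian of the negative log-likelihood at the estimate by finite differences, invert it to obtain the Cramér-Rao lower bound on the parameter variances, and divide each estimate by its standard error.

```python
import numpy as np

def neg_log_likelihood(beta, X, y):
    """Negative log-likelihood of a toy binary logit (a stand-in, not the paper's model)."""
    u = X @ beta
    # log(1 + exp(u)) computed stably via logaddexp
    return np.sum(np.logaddexp(0.0, u) - y * u)

def numerical_hessian(f, theta, eps=1e-4):
    """Central finite-difference Hessian of a scalar function f at theta."""
    k = theta.size
    H = np.zeros((k, k))
    for i in range(k):
        ei = np.zeros(k)
        ei[i] = eps
        for j in range(k):
            ej = np.zeros(k)
            ej[j] = eps
            H[i, j] = (f(theta + ei + ej) - f(theta + ei - ej)
                       - f(theta - ei + ej) + f(theta - ei - ej)) / (4.0 * eps ** 2)
    return 0.5 * (H + H.T)  # symmetrize to reduce rounding noise

def fit_logit(X, y, n_iter=25):
    """Newton-Raphson maximum likelihood fit for the toy logit."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (p - y)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta - np.linalg.solve(hess, grad)
    return beta

def t_statistics(f, theta_hat):
    """t-statistics from the inverse of the numerically estimated Hessian."""
    H = numerical_hessian(f, theta_hat)
    cov_lower_bound = np.linalg.inv(H)  # Cramér-Rao lower bound on the covariance
    se = np.sqrt(np.diag(cov_lower_bound))
    return theta_hat / se, se

# Synthetic data (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
beta_true = np.array([1.0, -0.5])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ beta_true)))).astype(float)

beta_hat = fit_logit(X, y)
t_stats, se = t_statistics(lambda b: neg_log_likelihood(b, X, y), beta_hat)
print("estimates:", beta_hat)
print("std. errors:", se)
print("t-statistics:", t_stats)
```

A finite-difference Hessian is useful when the likelihood has no convenient closed-form second derivatives, which is the situation the abstract describes; the step size `eps` here is an arbitrary choice and would need tuning for a real model.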