Statistical inference of probabilistic origin-destination demand using day-to-day traffic data
Recent transportation network studies on uncertainty and reliability call for modeling the probabilistic O-D demand and probabilistic network flow. Making the best use of day-to-day traffic data collected over many years, this paper develops a novel theoretical framework for estimating the mean and variance/covariance matrix of O-D demand considering the day-to-day variation induced by travelers' independent route choices. It also estimates the probability distributions of link/path flow and their travel cost where the variance stems from three sources, O-D demand, route choice and unknown errors. The framework estimates O-D demand mean and variance/covariance matrix iteratively, also known as iterative generalized least squares (IGLS) in statistics. Lasso regularization is employed to obtain sparse covariance matrix for better interpretation and computational efficiency. Though the probabilistic O-D estimation (ODE) works with a much larger solution space than the deterministic ODE, we show that its estimator for O-D demand mean is no worse than the best possible estimator by an error that reduces with the increase in sample size. The probabilistic ODE is examined on two small networks and two real-world large-scale networks. The solution converges quickly under the IGLS framework. In all those experiments, the results of the probabilistic ODE are compelling, satisfactory and computationally plausible. Lasso regularization on the covariance matrix estimation leans to underestimate most of variance/covariance entries. A proper Lasso penalty ensures a good trade-off between bias and variance of the estimation.
READ FULL TEXT