Online Learning and Optimization for Queues with Unknown Demand Curve and Service Distribution

by Xinyun Chen, et al.

We investigate an optimization problem in a queueing system where the service provider selects the service fee p and service capacity μ to maximize the cumulative expected profit (the service revenue minus the capacity cost and delay penalty). The conventional predict-then-optimize (PTO) approach takes two steps: first, it estimates the model parameters (e.g., arrival rate and service-time distribution) from data; second, it optimizes a model based on the estimated parameters. A major drawback of PTO is that the accuracy of its solution can be highly sensitive to parameter estimation errors, because PTO does not link these errors (step 1) to the quality of the optimized solution (step 2). To remedy this issue, we develop an online learning framework that automatically incorporates the parameter estimation errors into the solution prescription process; it is an integrated method that can "learn" the optimal solution without treating parameter estimation as a separate step, as PTO does. The effectiveness of our online learning approach is substantiated by (i) theoretical results, including algorithm convergence and an analysis of the regret (the "cost" paid over time for the algorithm to learn the optimal policy), and (ii) engineering confirmation via simulation experiments on a variety of representative examples. We also provide careful comparisons between PTO and the online learning method.
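
To make the two PTO steps concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: an M/M/1 queue, a linear demand curve λ(p) = a - b·p, a linear capacity cost, and a delay penalty proportional to the mean number in system. The function names, cost coefficients, and grid search are hypothetical choices made for illustration only; the paper's setting (unknown demand curve and general service distribution) is broader.

```python
def steady_state_profit(p, mu, a, b, cap_cost=1.0, hold_cost=1.0):
    """Long-run profit rate, assuming M/M/1 and linear demand lam = a - b*p."""
    lam = a - b * p
    if lam <= 0 or mu <= lam:
        return float("-inf")  # no demand, or the queue is unstable
    mean_in_system = lam / (mu - lam)  # standard M/M/1 formula
    return p * lam - cap_cost * mu - hold_cost * mean_in_system


def fit_linear_demand(prices, rates):
    """PTO step 1: least-squares fit of rate = a - b*price; returns (a, b)."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_r = sum(rates) / n
    sxy = sum((x - mean_p) * (y - mean_r) for x, y in zip(prices, rates))
    sxx = sum((x - mean_p) ** 2 for x in prices)
    slope = sxy / sxx
    return mean_r - slope * mean_p, -slope


def optimize(a, b, grid=100):
    """PTO step 2: grid search over (p, mu) using the fitted parameters."""
    best = (float("-inf"), None, None)
    for i in range(1, grid):
        p = (a / b) * i / grid          # prices between 0 and a/b
        for j in range(1, grid):
            mu = 2 * a * j / grid       # capacities between 0 and 2a
            value = steady_state_profit(p, mu, a, b)
            if value > best[0]:
                best = (value, p, mu)
    return best  # (estimated profit, p, mu)


# Hypothetical observed (price, arrival-rate) pairs.
a_hat, b_hat = fit_linear_demand([1.0, 2.0, 3.0], [8.0, 6.0, 4.0])
est_profit, p_star, mu_star = optimize(a_hat, b_hat)
```

In this sketch, step 2 treats `a_hat` and `b_hat` as if they were exact. Evaluating `steady_state_profit(p_star, mu_star, a_true, b_true)` with different true parameters shows how an estimation error in step 1 carries over, unaccounted for, into the prescribed (p, μ).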



An online learning approach to dynamic pricing and capacity sizing in service systems

We study a dynamic pricing and capacity sizing problem in a GI/GI/1 queu...

Optimal Parameter-free Online Learning with Switching Cost

Parameter-freeness in online learning refers to the adaptivity of an alg...

Finite Sample Guarantees for Distributed Online Parameter Estimation with Communication Costs

We study the problem of estimating an unknown parameter in a distributed...

An Online Learning Framework for Energy-Efficient Navigation of Electric Vehicles

Energy-efficient navigation constitutes an important challenge in electr...

Online Network Source Optimization with Graph-Kernel MAB

We propose Grab-UCB, a graph-kernel multi-arms bandit algorithm to learn...

Predictor-Corrector Policy Optimization

We present a predictor-corrector framework, called PicCoLO, that can tra...

Towards Shockingly Easy Structured Classification: A Search-based Probabilistic Online Learning Framework

There are two major approaches for structured classification. One is the...
