Learning and Generalization for Matching Problems
We study a classic algorithmic problem through the lens of statistical learning. That is, we consider a matching problem where the input graph is sampled from some distribution. This distribution is unknown to the algorithm; however, an additional graph which is sampled from the same distribution is given during a training phase (preprocessing). More specifically, the algorithmic problem is to match k out of n items that arrive online to d categories (d≪ k ≪ n). Our goal is to design a two-stage online algorithm that retains a small subset of items in the first stage which contains an offline matching of maximum weight. We then compute this optimal matching in a second stage. The added statistical component is that before the online matching process begins, our algorithms learn from a training set consisting of another matching instance drawn from the same unknown distribution. Using this training set, we learn a policy that we apply during the online matching process. We consider a class of online policies that we term thresholds policies. For this class, we derive uniform convergence results both for the number of retained items and the value of the optimal matching. We show that the number of retained items and the value of the offline optimal matching deviate from their expectation by O(√(k)). This requires usage of less-standard concentration inequalities (standard ones give deviations of O(√(n))). Furthermore, we design an algorithm that outputs the optimal offline solution with high probability while retaining only O(k n) items in expectation.
READ FULL TEXT