Exponential Weights on the Hypercube in Polynomial Time

06/12/2018
by Sudeep Raja Putta, et al.

We address the online linear optimization problem when the decision set is the entire {0,1}^n hypercube. It was previously unknown whether the exponential weights algorithm could be run over this decision set in polynomial time. In this paper, we give a simple polynomial-time algorithm that is equivalent to running exponential weights on {0,1}^n. In the full information setting, we show that our algorithm is equivalent to both Exp2 and Online Mirror Descent (OMD) with entropic regularization, which enables us to prove a tight regret bound for Exp2 on {0,1}^n. In the bandit setting, we show that our algorithm is equivalent to both Exp2 and OMD with entropic regularization as long as they use the same exploration distribution. In addition, we give a reduction from the {-1,+1}^n hypercube to the {0,1}^n hypercube for both the full information and bandit settings. This implies that exponential weights can also be run on {-1,+1}^n in polynomial time, resolving the problem of sampling from the exponential weights distribution in polynomial time, which was left as an open question by Bubeck et al. (2012).
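The reason such a sampler can run in polynomial time is that, with linear losses, the exponential weights distribution over {0,1}^n factorizes across coordinates, so each coordinate is an independent Bernoulli. The sketch below illustrates this standard factorization argument only; it is not the paper's verbatim algorithm, and the online loop, the random loss vectors, and the step size eta are placeholder assumptions for demonstration.

```python
import numpy as np

def sample_exp_weights_hypercube(cumulative_loss, eta, rng=None):
    """Draw x ~ P(x) proportional to exp(-eta * <L, x>) over {0,1}^n.

    Because <L, x> = sum_i L_i * x_i, the distribution factorizes:
    coordinate i equals 1 with probability
        exp(-eta * L_i) / (1 + exp(-eta * L_i)) = 1 / (1 + exp(eta * L_i)),
    so sampling takes O(n) time.
    """
    rng = np.random.default_rng() if rng is None else rng
    L = np.asarray(cumulative_loss, dtype=float)
    p_one = 1.0 / (1.0 + np.exp(eta * L))      # per-coordinate Bernoulli parameter
    return (rng.random(L.shape) < p_one).astype(int)

if __name__ == "__main__":
    # Illustrative full-information loop with hypothetical losses.
    n, T, eta = 5, 100, 0.1
    rng = np.random.default_rng(0)
    L = np.zeros(n)                            # cumulative loss per coordinate
    for t in range(T):
        x_t = sample_exp_weights_hypercube(L, eta, rng)  # play a vertex of {0,1}^n
        l_t = rng.uniform(0.0, 1.0, size=n)    # placeholder observed loss vector
        L += l_t                               # exponential weights update
```

For the {-1,+1}^n case, one natural way to reuse the same sampler is the affine map y = 2x - 1, which rescales and shifts the linear losses; the exact reduction used for the full information and bandit settings is given in the paper.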
