The Log-Concave Maximum Likelihood Estimator is Optimal in High Dimensions
We study the problem of learning a d-dimensional log-concave distribution from n i.i.d. samples with respect to both the squared Hellinger and the total variation distances. We show that for all d ≥ 4 the maximum likelihood estimator achieves an optimal risk (up to a logarithmic factor) of O_d(n^{-2/(d+1)} log(n)) in terms of the squared Hellinger distance. Previously, the optimality of the MLE was known only for d ≤ 3. Additionally, we show that the metric plays a key role, by proving that the minimax risk is at least Ω_d(n^{-2/(d+4)}) in terms of the total variation distance. Finally, we significantly improve the dimensional constant in the best known lower bound on the risk with respect to the squared Hellinger distance, improving the bound from 2^{-d} n^{-2/(d+1)} to Ω(n^{-2/(d+1)}). This implies that estimating a log-concave density up to a fixed accuracy requires a number of samples that is exponential in the dimension.
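The final implication follows by inverting the lower bound; as a sketch of the arithmetic (our own rate-to-sample-complexity conversion, not a formula taken from the paper): requiring squared Hellinger risk at most a fixed ε forces

\[
  n^{-2/(d+1)} \lesssim \epsilon
  \quad\Longleftrightarrow\quad
  n \gtrsim \epsilon^{-(d+1)/2} = e^{\frac{d+1}{2}\log(1/\epsilon)},
\]

so for any fixed accuracy ε < 1 the required sample size grows exponentially in the dimension d.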