Inductive Mutual Information Estimation: A Convex Maximum-Entropy Copula Approach
We propose a novel estimator of the mutual information between two ordinal vectors x and y. Our approach is inductive (as opposed to deductive) in that it depends on the data-generating distribution solely through some nonparametric properties revealing associations in the data, and does not require having enough data to fully characterize the true joint distribution P_{x,y}. Specifically, our approach consists of (i) noting that I(y; x) = I(u_y; u_x), where u_y and u_x are the copula-uniform dual representations of y and x (i.e. their images under the probability integral transform), and (ii) estimating the copula entropies h(u_y), h(u_x) and h(u_y, u_x) by solving a maximum-entropy problem over the space of copula densities under constraints of the form α_m = E[ϕ_m(u_y, u_x)]. We prove that, so long as the constraints are feasible, this problem admits a unique solution, which lies in the exponential family and can be learned by solving a convex optimization problem. The resulting estimator, which we denote MIND, is marginal-invariant, always non-negative, unbounded for any sample size n, consistent, has MSE rate O(1/n), and is more data-efficient than competing approaches. Beyond mutual information estimation, we illustrate that our approach may be used to mitigate mode collapse in GANs by maximizing the entropy of the copula of fake samples, a model we refer to as Copula Entropy Regularized GAN (CER-GAN).
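To make step (i) and the constraint in step (ii) concrete, the following is a minimal sketch and not the paper's implementation. It assumes a rank-based empirical probability integral transform and one hypothetical constraint function ϕ (a product of centred uniforms, similar to a Spearman-type statistic); the names `copula_uniform` and `constraint_moments` are illustrative only. It shows that the sample moment α is unchanged when strictly increasing transformations are applied to the marginals, which is the marginal-invariance property the abstract refers to.

```python
import numpy as np

def copula_uniform(z):
    """Empirical probability integral transform, column by column.

    Each value is replaced by rank / (n + 1), so the result lies in (0, 1).
    Assumes continuous data (no ties).
    """
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    n = z.shape[0]
    ranks = z.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n per column
    return ranks / (n + 1.0)

def constraint_moments(u_y, u_x, phis):
    """Sample estimates of alpha_m = E[phi_m(u_y, u_x)]."""
    return np.array([np.mean(phi(u_y, u_x)) for phi in phis])

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 1))
y = x + 0.5 * rng.normal(size=(500, 1))

# Hypothetical constraint function: product of centred copula-uniforms.
phis = [lambda u_y, u_x: (u_y - 0.5) * (u_x - 0.5)]

# Moments computed from the raw data ...
alpha = constraint_moments(copula_uniform(y), copula_uniform(x), phis)

# ... and after strictly increasing transformations of each marginal.
alpha_transformed = constraint_moments(
    copula_uniform(np.exp(y)), copula_uniform(x ** 3), phis
)
```

Because ranks are preserved by strictly increasing maps, `alpha` and `alpha_transformed` coincide, so any estimator built only from such moments inherits the same invariance.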