k-Means is a Variational EM Approximation of Gaussian Mixture Models

04/16/2017
by Jörg Lücke, et al.

We show that k-means (Lloyd's algorithm) is equivalent to a variational EM approximation of a Gaussian Mixture Model (GMM) with isotropic Gaussians. The k-means algorithm is obtained if truncated posteriors are used as variational distributions. In contrast to the standard way of relating k-means and GMMs, we show that it is not necessary to consider the limit case of Gaussians with zero variance. Several consequences follow from this observation: (A) k-means can be shown to monotonically increase the free energy associated with truncated distributions; (B) using the free energy, we can derive an explicit and compact formula for a lower bound of the GMM likelihood that takes the k-means objective as its argument; (C) we can generalize k-means using truncated variational EM and relate such generalizations to other k-means-like algorithms. In general, truncated variational EM provides a natural and quantitative link between k-means-like clustering and GMM clustering algorithms, which may be very relevant for future theoretical as well as empirical studies.
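The relation stated in the abstract can be illustrated with a minimal sketch. It is not the paper's derivation or code. It assumes an isotropic GMM with equal mixing weights and a shared variance `sigma2` that is held fixed for simplicity, and all function names (`truncated_e_step`, `m_step`, `lloyd_step`) as well as the parameter `n_keep` are hypothetical. The truncated posterior keeps only the `n_keep` nearest clusters per data point and renormalizes over them. With `n_keep = 1` the mean update should coincide with one Lloyd iteration for any finite positive variance, with no zero-variance limit needed.

```python
import math


def sq_dist(x, m):
    """Squared Euclidean distance between two points given as lists."""
    return sum((a - b) ** 2 for a, b in zip(x, m))


def truncated_e_step(data, means, sigma2, n_keep):
    """Truncated posteriors (illustrative): per point, keep the n_keep
    nearest clusters and renormalize the isotropic Gaussian weights over
    that set. Equal mixing weights and a shared variance are assumed."""
    resp = []
    for x in data:
        d = [sq_dist(x, m) for m in means]
        keep = sorted(range(len(means)), key=lambda c: d[c])[:n_keep]
        d_min = d[keep[0]]  # subtract the minimum for numerical stability
        w = {c: math.exp(-(d[c] - d_min) / (2.0 * sigma2)) for c in keep}
        z = sum(w.values())
        resp.append({c: w[c] / z for c in keep})
    return resp


def m_step(data, resp, means):
    """Responsibility-weighted mean update; empty clusters keep their mean."""
    new_means = []
    for c, old in enumerate(means):
        total = sum(r.get(c, 0.0) for r in resp)
        if total == 0.0:
            new_means.append(list(old))
            continue
        new_means.append([
            sum(r.get(c, 0.0) * x[j] for x, r in zip(data, resp)) / total
            for j in range(len(old))
        ])
    return new_means


def lloyd_step(data, means):
    """One standard k-means (Lloyd) iteration, for comparison."""
    assign = [min(range(len(means)), key=lambda c: sq_dist(x, means[c]))
              for x in data]
    new_means = []
    for c, old in enumerate(means):
        members = [x for x, a in zip(data, assign) if a == c]
        if not members:
            new_means.append(list(old))
            continue
        new_means.append([sum(col) / len(members) for col in zip(*members)])
    return new_means
```

Calling `m_step(data, truncated_e_step(data, means, sigma2, 1), means)` and `lloyd_step(data, means)` on the same inputs gives a quick check of the correspondence. Setting `n_keep` above 1 yields a soft, k-means-like update of the kind the abstract refers to under (C).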
