Topological mixture estimation
Density functions that represent sample data are often multimodal, i.e. they exhibit more than one maximum. Typically this behavior is taken to indicate that the underlying data deserve a more detailed representation as a mixture of densities with individually simpler structure. The usual specification of a component density is quite restrictive: log-concave densities are the most general case considered in the literature, and Gaussian densities are the overwhelmingly typical case. The number of mixture components must also be determined a priori, and much art is devoted to this choice. Here, we introduce topological mixture estimation, a completely nonparametric and computationally efficient solution to the one-dimensional problem in which mixture components need only be unimodal. We repeatedly perturb the unimodal decomposition of Baryshnikov and Ghrist to produce a topologically and information-theoretically optimal unimodal mixture. We also detail a smoothing process that optimally exploits the topological persistence of the unimodal category in a natural way when working directly with sample data. Finally, we illustrate these techniques through examples.
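To make the idea of a unimodal decomposition concrete, the following is a minimal sketch of a greedy left-to-right sweep that splits a sampled nonnegative function into unimodal pieces summing to the original. It is an illustration only, not the algorithm of the paper: it omits the repeated perturbation, the information-theoretic optimization, and the persistence-based smoothing described above. The function name `greedy_unimodal_sweep` and the tolerance `tol` are assumptions introduced here.

```python
import numpy as np


def greedy_unimodal_sweep(f, tol=1e-12):
    """Split nonnegative samples f into unimodal pieces that sum to f.

    Hypothetical helper: a simplified greedy sweep, assumed here for
    illustration; it is not the authors' implementation.
    """
    residual = np.asarray(f, dtype=float).copy()
    if residual.size == 0:
        return []
    if np.any(residual < 0):
        raise ValueError("f must be nonnegative")

    components = []
    while residual.max() > tol:
        n = residual.size
        # Walk right along the nondecreasing run to the first peak.
        peak = 0
        while peak + 1 < n and residual[peak + 1] >= residual[peak]:
            peak += 1

        # The component equals the residual up to and including the peak.
        u = np.zeros(n)
        u[: peak + 1] = residual[: peak + 1]

        # Past the peak the component only follows decreases of the
        # residual, and it is clipped at zero, so it never rises again.
        for i in range(peak + 1, n):
            drop = min(residual[i] - residual[i - 1], 0.0)
            u[i] = max(u[i - 1] + drop, 0.0)

        components.append(u)
        residual = residual - u
        # Suppress floating-point noise left over from the subtraction.
        residual[residual < tol] = 0.0

    return components
```

For instance, calling `greedy_unimodal_sweep([0, 1, 3, 1, 2, 4, 0])` on this bimodal input yields two components, each nondecreasing up to a single peak and nonincreasing afterwards. Each pass zeroes the residual up to the current peak, so the loop terminates after at most as many passes as there are samples.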