Curvature of point clouds through principal component analysis
In this article, we study a curvature-like feature value of data sets in Euclidean spaces. First, we formulate curvature functions with desirable properties under the manifold hypothesis. Then, using the law of large numbers, we formulate a test property for the validity of such a curvature function and check it by numerical experiments for the function we construct. These experiments also lead us to conjecture that the mean of the curvatures of sample manifolds coincides with the curvature of the mean manifold. Our construction is based on dimension estimation by principal component analysis and on the Gaussian curvature of hypersurfaces. Our function depends on provisional parameters ε and δ, and we suggest treating the result as a function of these parameters to gain some robustness. As an application, we propose a method for decomposing data sets into parts that reflect local structure: we embed the data sets into a higher-dimensional Euclidean space by using curvature values and cluster them in the embedded space. We also give computational experiments that support the effectiveness of our methods.
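To make the dimension-estimation step concrete, here is a minimal sketch of local PCA on a point cloud. It is not the curvature function of the article. The abstract does not say how ε and δ enter the construction, so this sketch assumes that `eps` is a neighborhood radius and `delta` is a relative eigenvalue threshold. The function name `local_pca_dimension` and the sample data are hypothetical.

```python
import numpy as np


def local_pca_dimension(points, center, eps, delta):
    """Estimate the local dimension of a point cloud near `center`.

    Assumptions (not taken from the article): `eps` is the radius of the
    neighborhood, and `delta` is a relative threshold on the covariance
    eigenvalues.
    """
    points = np.asarray(points, dtype=float)
    # Keep only the points within distance eps of the center.
    dists = np.linalg.norm(points - center, axis=1)
    nbrs = points[dists <= eps]
    if len(nbrs) < 2:
        return 0
    # PCA: eigenvalues of the covariance matrix of the neighborhood.
    centered = nbrs - nbrs.mean(axis=0)
    cov = centered.T @ centered / len(nbrs)
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # sorted in descending order
    if eigvals[0] <= 0:
        return 0
    # Count the directions whose variance is significant relative to the largest.
    return int(np.sum(eigvals > delta * eigvals[0]))


# Hypothetical sample: points on a unit circle embedded in R^3.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 2000)
circle = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])
dim = local_pca_dimension(circle, circle[0], eps=0.2, delta=0.05)
print(dim)
```

On a small neighborhood of the circle, the variance along the tangent direction dominates, so the estimate is expected to be 1. The estimate changes with `eps` and `delta`, which illustrates why the result is better read as a function of both parameters than at a single setting.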