Multi-point dimensionality reduction to improve projection layout reliability
In ordinary Dimensionality Reduction (DR), each data instance in an m-dimensional space (original space) is mapped to one point in a d-dimensional space (visual space), preserving as much as possible distance and/or neighborhood relationships. Despite their popularity, even for simple datasets, the existing DR techniques unavoidably may produce misleading visual representations. The problem is not with the existing solutions but with problem formulation. For two dimensional visual space, if data instances are not co-planar or do not lie on a 2D manifold, there is no solution for the problem, and the possible approximations usually result in layouts with inaccuracies in the distance preservation and overlapped neighborhoods. In this paper, we elaborate on the concept of Multi-point Dimensionality Reduction where each data instance can be mapped to possibly more than one point in the visual space by providing the first general solution to it as a step toward mitigating this issue. By duplicating points, background information is added to the visual representation making local neighborhoods in the visual space more faithful to the original space. Our solution, named Red Gray Plus, is built upon and extends a combination of ordinary DR and graph drawing techniques. We show that not only Multi-point Dimensionality Reduction can be one of the potential directions to improve DR layouts' reliability but also that our initial solution to the problem outperforms popular ordinary DR methods quantitatively.
READ FULL TEXT