Abstract:
|
Manifold learning methods model high-dimensional data through low-dimensional manifolds embedded in the observed data space. This simplification implies that they are prone to trustworthiness and continuity errors. Generative Topographic Mapping (GTM) is one such manifold learning method for multivariate data clustering and visualization, defined within a probabilistic framework. In the original formulation, GTM is optimized by minimization of an error that is a function of Euclidean distances, making it vulnerable to the aforementioned errors, especially for datasets of convoluted geometry. Here, we modify GTM to penalize divergences between
the Euclidean distances from the data points to the model prototypes and the corresponding geodesic distances along the manifold. Several experiments with artificial data show that this strategy improves the continuity and trustworthiness of the data representation generated by the model. |