Universitat Politècnica de Catalunya. Doctorat en Teoria del Senyal i Comunicacions
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació
Universitat Politècnica de Catalunya. ViRVIG - Grup de Recerca en Visualització, Realitat Virtual i Interacció Gràfica
2023-09-04
In data science and visualization, dimensionality reduction techniques have been extensively employed for exploring large datasets. These techniques transform high-dimensional data into reduced versions, typically in 2D, with the aim of preserving significant properties of the original data. Many dimensionality reduction algorithms exist, and nonlinear approaches such as t-SNE (t-Distributed Stochastic Neighbor Embedding) and UMAP (Uniform Manifold Approximation and Projection) have gained popularity in the field of information visualization. In this paper, we introduce a simple yet powerful manipulation for vector datasets that modifies their values based on weight frequencies. This technique significantly improves the results of dimensionality reduction algorithms across various scenarios. To demonstrate the efficacy of our methodology, we conduct an analysis on a collection of well-known labeled datasets. The results show improved clustering performance when attempting to classify the data in the reduced space. Our proposal presents a comprehensive and adaptable approach to enhancing the outcomes of dimensionality reduction for visual data exploration.
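The abstract does not state how the weights are derived from frequencies, so the following is only a minimal sketch under an explicit assumption: a smoothed, TF-IDF-style inverse-frequency weight per dimension. The function name `frequency_reweight`, the `eps` threshold, and the weighting formula are hypothetical illustrations and should not be read as the authors' actual method.

```python
import math


def frequency_reweight(vectors, eps=1e-12):
    """Scale each dimension by a smoothed inverse-frequency weight.

    ASSUMPTION: this IDF-style scheme is a stand-in for illustration only;
    the paper's own weighting is not described in the abstract.
    """
    n = len(vectors)
    dims = len(vectors[0])
    # Frequency of each dimension: how many vectors have a non-zero entry there.
    freq = [sum(1 for v in vectors if abs(v[d]) > eps) for d in range(dims)]
    # Smoothed inverse-frequency weight; a dimension active in every vector gets 1.0.
    weights = [math.log((1 + n) / (1 + freq[d])) + 1.0 for d in range(dims)]
    reweighted = [[v[d] * weights[d] for d in range(dims)] for v in vectors]
    return reweighted, weights


# Toy dataset: dimension 0 is active everywhere, dimensions 1 and 2 are rare.
vectors = [
    [1.0, 0.0, 2.0],
    [1.0, 3.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
reweighted, weights = frequency_reweight(vectors)
```

Under this assumption, rare dimensions are amplified relative to ubiquitous ones before the data is handed to a projection algorithm such as t-SNE or UMAP; the reweighted vectors would simply replace the original ones as input.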
This research was funded by PID2021-122136OB-C21 from the Ministerio de Ciencia e Innovación, Spain, and by FEDER (EU) funds.
Peer Reviewed
Postprint (published version)
Article
English
UPC subject areas::Computer science::Artificial intelligence; UPC subject areas::Computer science::Information systems; Data sets; Information visualization; Dimensionality reduction; Data visualization; Document embeddings
Multidisciplinary Digital Publishing Institute
https://www.mdpi.com/2076-3417/13/17/9967
info:eu-repo/grantAgreement/AEI/ PLAN ESTATAL DE INVESTIGACIÓN CIENTÍFICA Y TÉCNICA Y DE INNOVACIÓN 2017-2020/PID2021-122136OB-C21/ES/Entornos 3D de alta fidelidad para Realidad Virtual y Computación Visual: geometría, movimiento, interacción y visualización para salud, arquitectura y ciudades/
http://creativecommons.org/licenses/by/4.0/
Open Access
Attribution 4.0 International