Automated data pre-processing via meta-learning

Bilalli, Besim; Abelló Gamazo, Alberto; Aluja Banet, Tomàs; Wrembel, Robert

Author

Bilalli, Besim

Abelló Gamazo, Alberto

Aluja Banet, Tomàs

Wrembel, Robert

Other authors

Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació

Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa

Universitat Politècnica de Catalunya. MPI - Modelització i Processament de la Informació

Universitat Politècnica de Catalunya. LIAM - Laboratori de Modelització i Anàlisi de la Informació

Publication date

2016

Abstract

The final publication is available at link.springer.com

A data mining algorithm may perform differently on datasets with different characteristics; for example, it might perform better on a dataset with continuous attributes than on one with categorical attributes, or the other way around. In practice, a dataset usually needs to be pre-processed. Taking into account all the possible pre-processing operators, there is a staggeringly large number of alternatives, and non-experienced users become overwhelmed. We show that this problem can be addressed by an automated approach that leverages ideas from meta-learning. Specifically, we consider a wide range of data pre-processing techniques and a set of data mining algorithms. For each data mining algorithm and selected dataset, we are able to predict the transformations that improve the result of the algorithm on that dataset. Our approach will help non-expert users identify more effectively the transformations appropriate to their applications, and hence achieve improved results.
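The idea of predicting whether a transformation helps can be illustrated with a minimal sketch. It is not the authors' implementation: the two meta-features, the nearest-neighbour vote used as the meta-learner, and every value in `META_DATASET` are assumptions chosen only to show the shape of the approach for a single (algorithm, transformation) pair.

```python
from math import sqrt


def meta_features(rows, categorical_columns):
    """Describe a dataset by two simple, hypothetical meta-features:
    the fraction of categorical columns and the fraction of missing cells."""
    n_rows = len(rows)
    n_cols = len(rows[0])
    frac_categorical = len(categorical_columns) / n_cols
    n_missing = sum(1 for row in rows for value in row if value is None)
    frac_missing = n_missing / (n_rows * n_cols)
    return (frac_categorical, frac_missing)


def distance(a, b):
    """Euclidean distance between two meta-feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_improves(meta_dataset, query, k=3):
    """Majority vote among the k past datasets most similar to the query."""
    nearest = sorted(meta_dataset, key=lambda rec: distance(rec[0], query))[:k]
    votes = sum(1 for _, improved in nearest if improved)
    return votes * 2 > len(nearest)


# Illustrative meta-dataset (made-up values, not experimental results):
# each record pairs the meta-features of a past dataset with whether the
# transformation improved the algorithm's result on that dataset.
META_DATASET = [
    ((0.9, 0.0), True),
    ((0.8, 0.1), True),
    ((0.7, 0.0), True),
    ((0.1, 0.0), False),
    ((0.0, 0.2), False),
    ((0.2, 0.1), False),
]

# A new, mostly categorical dataset with one missing value.
new_dataset = [
    ["red", "small", 1.5],
    ["blue", "large", None],
    ["red", "large", 2.0],
]
query = meta_features(new_dataset, categorical_columns=[0, 1])
recommend = predict_improves(META_DATASET, query)
```

Here `recommend` says whether the hypothetical transformation would be suggested for `new_dataset`. In a fuller setting, one such meta-model would be kept per algorithm and per transformation, and the meta-dataset would come from actually running the algorithm before and after each transformation.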

Peer Reviewed

Postprint (published version)

Document type

Conference report

Language

English

Subjects and keywords

UPC subject areas::Computer science::Software engineering; Data mining -- Statistical methods; Data handling; Automated approach; Automated data; Categorical attributes; Continuous attribute; Data mining algorithm; Data preprocessing; Expert users; Pre-processing

Related documents

http://link.springer.com/chapter/10.1007/978-3-319-45547-1_16

Rights

Open Access

This item appears in the following collection(s)

E-prints [73034]
