Title:
|
Apache Mahout’s k-Means vs. fuzzy k-Means performance evaluation
|
Author:
|
Xhafa Xhafa, Fatos; Bogza, Adriana; Caballé Llobet, Santiago; Barolli, Leonard
|
Other authors:
|
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació |
Abstract:
|
(c) 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. |
Abstract:
|
The emergence of the Big Data as a disruptive technology for next generation of intelligent systems, has brought many issues of how to extract and make use of the knowledge obtained from the data within short times, limited budget and under high rates of data generation. The foremost challenge identified here is the data processing, and especially, mining and analysis for knowledge extraction. As the 'old' data mining frameworks were designed without Big Data requirements, a new generation of such frameworks is being developed fully implemented in Cloud platforms. One such frameworks is Apache Mahout aimed to leverage fast processing and analysis of Big Data. The performance of such new data mining frameworks is yet to be evaluated and potential limitations are to be revealed. In this paper we analyse the performance of Apache Mahout using large real data sets from the Twitter stream. We exemplify the analysis for the case of two clustering algorithms, namely, k-Means and Fuzzy k-Means, using a Hadoop cluster infrastructure for the experimental study. |
Abstract:
|
Peer Reviewed |
Subject(s):
|
-Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació -Data mining -Big data -Computer algorithms -Hadoop cluster -Data mining algorithms -Apache mahout -Big data -K-means -Fuzzy k-means -Performance -Mineria de dades -Macrodades -Algorismes computacionals |
Rights:
|
|
Document type:
|
Article - Submitted version Conference Object |
Published by:
|
Institute of Electrical and Electronics Engineers (IEEE)
|
Share:
|
|