Low-latency multi-threaded ensemble learning for dynamic big data streams


To access the full-text documents, please follow this link: http://hdl.handle.net/2117/120515

Title:	Low-latency multi-threaded ensemble learning for dynamic big data streams
Author(s):	Marron, Diego; Ayguadé Parra, Eduard; Herrero Zaragoza, José Ramón; Read, Jesse; Bifet Figuerol, Albert Carles
Other authors:	Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors; Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
Abstract:	© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Abstract:	Real-time mining of evolving data streams involves new challenges when targeting today’s application domains such as the Internet of Things: increasing volume, velocity and volatility require data to be processed on-the-fly with fast reaction and adaptation to changes. This paper presents a high-performance, scalable design for decision trees and ensemble combinations that makes use of the vector SIMD and multicore capabilities available in modern processors to provide the required throughput and accuracy. The proposed design offers very low latency and good scalability with the number of cores on commodity hardware when compared to other state-of-the-art implementations. On an Intel i7-based system, processing a single decision tree is 6x faster than MOA (Java) and 7x faster than StreamDM (C++), two well-known reference implementations. On the same system, using the 6 cores (and 12 hardware threads) available allows an ensemble of 100 learners to be processed 85x faster than MOA while providing the same accuracy. Furthermore, our solution is highly scalable: on an Intel Xeon socket with large core counts, the proposed ensemble design achieves up to 16x speed-up when employing 24 cores with respect to a single-threaded execution.
Abstract:	This work is partially supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316-P project, by the Generalitat de Catalunya (contract 2014-SGR-1051), by the Universitat Politècnica de Catalunya through an FPI/UPC scholarship and by NVIDIA through the UPC/BSC GPU Center of Excellence.
Abstract:	Peer Reviewed
Subject(s):	Àrees temàtiques de la UPC::Informàtica::Sistemes d'informació::Emmagatzematge i recuperació de la informació; Big data; Data streams; Random forests; Hoeffding tree; Low-latency; High performance; Dades massives
Rights:
Document type:	Article - Submitted version; Conference object
Publisher:	Institute of Electrical and Electronics Engineers (IEEE)

Related documents

Other documents by the same author

Echo state hoeffding tree learning

Marrón Vida, Diego; Read, Jesse; Bifet Figuerol, Albert Carles; Abdessalem, Talel; Ayguadé Parra, Eduard; Herrero Zaragoza, José Ramón

Data stream classification using random feature functions and novel method combinations

Marrón Vida, Diego; Read, Jesse; Bifet Figuerol, Albert Carles; Navarro, Nacho

Tareador: a tool to unveil parallelization strategies at undergraduate level

Ayguadé Parra, Eduard; Badia Sala, Rosa Maria; Jiménez González, Daniel; Herrero Zaragoza, José Ramón; Labarta Mancho, Jesús José; Subotic, Vladimir; Utrera Iglesias, Gladys Miriam

Mining frequent closed graphs on evolving data streams.

Bifet Figuerol, Albert Carles; Holmes, Geoff; Pfahringer, Bernhard; Gavaldà Mestre, Ricard

Detecting sentiment change in twitter streaming data

Bifet Figuerol, Albert Carles; Holmes, Geoffrey; Pfahringer, Bernhard; Gavaldà Mestre, Ricard
