dc.contributor |
Barcelona Supercomputing Center |
dc.contributor.author |
Villalba, Álvaro |
dc.contributor.author |
Carrera, David |
dc.date |
2018-12 |
dc.identifier.citation |
Villalba, Á.; Carrera, D. Multi-tenant Pub/Sub Processing for Real-Time Data Streams. A: "Euro-Par 2018: Parallel Processing Workshops". Springer, 2018, p. 251-526. |
dc.identifier.citation |
978-3-030-10548-8 |
dc.identifier.citation |
10.1007/978-3-030-10549-5_20 |
dc.identifier.uri |
http://hdl.handle.net/2117/129338 |
dc.language.iso |
eng |
dc.publisher |
Springer |
dc.relation |
https://link.springer.com/chapter/10.1007/978-3-030-10549-5_20 |
dc.relation |
info:eu-repo/grantAgreement/EC/H2020/639595/EU/Holistic Integration of Emerging Supercomputing Technologies/Hi-EST |
dc.relation |
info:eu-repo/grantAgreement/ES/PE2013-2016/TIN2015-65316-P |
dc.rights |
info:eu-repo/semantics/openAccess |
dc.subject |
UPC subject areas::Computer science |
dc.subject |
High performance computing |
dc.subject |
Big Data |
dc.subject |
Analytics |
dc.subject |
Stream Processing |
dc.subject |
Real-time Data Processing |
dc.subject |
Programming Models |
dc.subject |
Internet of Things |
dc.subject |
Supercomputers |
dc.title |
Multi-tenant Pub/Sub Processing for Real-Time Data Streams |
dc.type |
info:eu-repo/semantics/submittedVersion |
dc.type |
info:eu-repo/semantics/conferenceObject |
dc.description.abstract |
Devices and sensors generate streams of data across a diversity of locations and protocols. That data usually reaches a central platform that stores and processes the streams. Processing can be done in real time, with transformations and enrichment happening on the fly, but it can also happen after the data is stored and organized in repositories. In the former case, stream processing technologies are required to operate on the data; in the latter, batch analytics and queries are commonly used.
This paper introduces a runtime to dynamically construct data stream processing topologies based on user-supplied code. These dynamic topologies are built on the fly using a data subscription model defined by the applications that consume data. Each user-defined processing unit is called a Service Object. Every Service Object consumes input data streams and may produce output streams that others can consume. The subscription-based programming model enables multiple users to deploy their own data-processing services. The runtime dynamically forwards data and executes Service Objects from different users. Data streams can originate in real-world devices or they can be the outputs of Service Objects.
The runtime leverages Apache Storm for parallel data processing, which, combined with dynamic user-code injection, provides multi-tenant stream processing topologies. In this work we describe the runtime, its features, and its implementation details, and we include a performance evaluation of some of its core components. |
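The subscription model described in the abstract can be illustrated with a minimal single-process sketch. It is not the paper's runtime and does not use Apache Storm; the names `ServiceObject`, `Runtime`, `deploy`, `publish`, and the stream names are all hypothetical, chosen only to show how Service Objects from different tenants can be chained through subscriptions.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ServiceObject:
    """A user-defined processing unit (hypothetical shape, not the paper's API)."""
    owner: str                   # tenant that deployed the code
    inputs: list[str]            # names of the streams it subscribes to
    output: Optional[str]        # stream it publishes to, if any
    code: Callable[[Any], Any]   # user-supplied function; returning None emits nothing


class Runtime:
    """Forwards each update to every Service Object subscribed to its stream."""

    def __init__(self) -> None:
        self.subscribers: dict[str, list[ServiceObject]] = defaultdict(list)
        self.log: list[tuple[str, Any]] = []  # every (stream, value) forwarded

    def deploy(self, so: ServiceObject) -> None:
        # The topology grows as tenants register their subscriptions.
        for stream in so.inputs:
            self.subscribers[stream].append(so)

    def publish(self, stream: str, value: Any) -> None:
        # Breadth-first forwarding; assumes the subscription graph has no cycles.
        queue = deque([(stream, value)])
        while queue:
            name, item = queue.popleft()
            self.log.append((name, item))
            for so in self.subscribers[name]:
                result = so.code(item)
                if result is not None and so.output is not None:
                    queue.append((so.output, result))


runtime = Runtime()
# Tenant "alice" converts a device stream from Celsius to Fahrenheit.
runtime.deploy(ServiceObject("alice", ["device/temp_c"], "alice/temp_f",
                             lambda c: c * 9 / 5 + 32))
# Tenant "bob" consumes alice's output and keeps only readings above 90 F.
runtime.deploy(ServiceObject("bob", ["alice/temp_f"], "bob/alerts",
                             lambda f: f if f > 90 else None))

runtime.publish("device/temp_c", 20)  # converted by alice, dropped by bob
runtime.publish("device/temp_c", 35)  # converted by alice, forwarded by bob
```

In this sketch a stream is just a name, so a device stream and the output of a Service Object are subscribed to in the same way, which mirrors the abstract's statement that streams may originate in devices or in other Service Objects. A real deployment would additionally need isolation between tenants' code and parallel execution, which the paper delegates to its Storm-based runtime.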
dc.description.abstract |
This work is partially supported by the European Research Council (ERC) under the EU Horizon 2020 programme (GA 639595), the Spanish Ministry of Economy, Industry and Competitivity (TIN2015-65316-P) and the Generalitat de Catalunya (2014-SGR-1051). |
dc.description.abstract |
Peer Reviewed |