To access the full text documents, please follow this link: http://hdl.handle.net/2117/173031

A hardware runtime for task-based programming models
Tan, Xubin; Bosch, Jaume; Álvarez, Carlos; Jiménez González, Daniel; Ayguadé Parra, Eduard; Valero Cortés, Mateo
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors; Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Task-based programming models such as OpenMP 5.0 and OmpSs are simple to use and powerful enough to exploit the task parallelism of applications on multicore, manycore and heterogeneous systems. However, their software-only runtimes introduce significant overhead when targeting fine-grained tasks, resulting in performance losses. To overcome this drawback, we present Picos++, a hardware runtime that accelerates critical runtime functions such as task dependence analysis, nested task support, and heterogeneous task scheduling. As a proof of concept, the Picos++ hardware runtime has been integrated with a compiler infrastructure that supports parallel task-based programming models. An FPGA SoC running Linux has been used to implement the hardware-accelerated part of Picos++, integrated with a heterogeneous system composed of 4 symmetric multiprocessor (SMP) cores and several hardware functional accelerators (HwAccs) for task execution. Results show significant improvements in energy and performance compared to state-of-the-art parallel software-only runtimes. With Picos++, applications can achieve up to 7.6x speedup and save up to 90 percent of energy when using 4 threads and up to 4 HwAccs, and can even reach a speedup of 16x over the software alternative when using 12 HwAccs and small tasks.
Peer Reviewed
-UPC subject areas::Computer science::Computer architecture::Parallel architectures
-Field programmable gate arrays
-Multiprocessors
-Parallel processing (Electronic computers)
-Fine-grained parallelism
-Task-dependence analysis
-Nested tasks
-Heterogeneous task scheduling
-Energy saving
-FPGA
-Task-based programming models
Article - Submitted version
         
