Notes:
|
Improving the throughput of a parallel application that processes an input data stream is a difficult
challenge. For typical coarse-grain applications, where the computation time of tasks is greater than
their communication time, the maximum achievable throughput is determined by the maximum task
computation time. Thus, improving throughput beyond this maximum would eventually require
modifying the source code of the tasks. In this work, we address the improvement of throughput
by proposing two task replication methodologies that have the target throughput to be achieved as
an input parameter. They proceed by generating a new task graph structure that permits the target
throughput to be achieved. The first replication mechanism, named DPRM (Data Parallel Replication
Mechanism), exploits the inner task data parallelism. The second mechanism, named TCRM (Task Copy
Replication Mechanism), creates new execution paths inside the application task graph structure that
allow more than one instance of data to be processed concurrently. We evaluate the effectiveness of
these mechanisms with three real applications executed in a cluster system: the MPEG2 video compressor,
the IVUS (Intra-Vascular Ultra-Sound) medical image application and the BASIZ (Bright and SAtured
Images Zone) video processing application. In all these cases, the throughput obtained after
applying the proposed replication mechanisms was greater than what the application could provide
with the original implementation.
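The throughput bound described above can be illustrated with a minimal sketch. It is not the DPRM or TCRM algorithm itself: it only assumes that communication time is negligible, that the pipeline throughput is the inverse of the slowest task's computation time, and that data items are dealt evenly among the copies of a replicated task. The function names and the task times are hypothetical.

```python
import math


def max_throughput(task_times):
    """Upper bound on throughput (items per second) of a pipeline whose
    slowest task needs max(task_times) seconds per data item.
    Assumption: communication time is ignored."""
    return 1.0 / max(task_times)


def replicas_needed(task_time, target_throughput):
    """Smallest number of copies of one task such that, with items dealt
    evenly among the copies, the task sustains target_throughput.
    The small tolerance guards against floating-point noise in the product."""
    return max(1, math.ceil(task_time * target_throughput - 1e-9))


def replication_plan(task_times, target_throughput):
    """Number of copies per task needed to reach the target throughput."""
    return [replicas_needed(t, target_throughput) for t in task_times]


# Hypothetical three-task pipeline; times are seconds per data item.
times = [0.1, 0.4, 0.2]
bound = max_throughput(times)         # limited by the 0.4 s task
plan = replication_plan(times, 5.0)   # copies per task for 5 items/s
```

With these illustrative numbers, only the 0.4 s task needs a second copy to reach 5 items per second; the other tasks already meet the target with a single instance.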
This work was supported by the Ministry of Education and Science (Spain) under contract TIN2011-28689-C02-02. |