Work-efficient parallel non-maximum suppression for embedded GPU architectures

Oro Garcia, David; Fernandez Tena, Carles; Martorell Bofill, Xavier; Hernando Pericás, Francisco Javier

Author

Oro Garcia, David

Fernandez Tena, Carles

Martorell Bofill, Xavier

Hernando Pericás, Francisco Javier

Other authors

Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors

Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions

Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions

Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla

Publication date

2016

Abstract

With the emergence of GPU computing, deep neural networks have become a widely used technique for advancing research in image and speech processing. In object and event detection, sliding-window classifiers must select the best among all positively discriminated candidate windows. In this paper, we introduce the first GPU-based non-maximum suppression (NMS) algorithm for embedded GPU architectures. The results show that the proposed parallel algorithm reduces NMS latency by a wide margin compared to CPUs, even when the GPU is clocked at 50% of its maximum frequency on an NVIDIA Tegra K1. We show results for object detection in images; the proposed technique is also directly applicable to speech segmentation tasks such as speaker diarization.
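For readers unfamiliar with the task the abstract refers to, the following is a minimal sketch of the classic sequential greedy NMS that CPU implementations commonly use. It is not the parallel GPU algorithm proposed in the paper, whose details are not given in this record. The names `iou` and `greedy_nms`, the box layout `(x1, y1, x2, y2)`, and the default overlap threshold of 0.5 are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2), with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return the indices of the kept boxes, highest score first.

    Sequential baseline: visit candidates in descending score order and
    keep one only if it does not overlap an already kept box by more
    than iou_threshold (a hypothetical default, not from the paper).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

As a usage example, `greedy_nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [0.9, 0.8, 0.7])` returns `[0, 2]`: the second window overlaps the higher-scoring first one with an IoU of about 0.68 and is suppressed, while the third does not overlap either and is kept. The loop compares each candidate against all previously kept boxes, which is the sequential dependency a parallel formulation has to work around.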

Peer Reviewed

Postprint (published version)

Document type

Conference report

Language

English

Published by

Institute of Electrical and Electronics Engineers (IEEE)

Related documents

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7471831

info:eu-repo/grantAgreement/EC/H2020/644312/EU/Heterogeneous Secure Multi-level Remote Acceleration Service for Low-Power Integrated Systems and Devices/RAPID

Rights

Restricted access - publisher's policy

This item appears in the following collection(s)

E-prints [72986]
