QeiHaN: An energy-efficient DNN accelerator that leverages log quantization in NDP architectures

Khabbazan, Bahareh; Riera Villanueva, Marc; González Colás, Antonio María

dc.contributor

Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors

dc.contributor

Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors

dc.contributor

Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors

dc.contributor.author

Khabbazan, Bahareh

dc.contributor.author

Riera Villanueva, Marc

dc.contributor.author

González Colás, Antonio María

dc.date.issued

2023

dc.identifier

Khabbazan, B.; Riera, M.; Gonzalez, A. QeiHaN: An energy-efficient DNN accelerator that leverages log quantization in NDP architectures. A: International Conference on Parallel Architectures and Compilation Techniques. "2023 32nd International Conference on Parallel Architecture and Compilation Techniques, PACT 2023: Vienna, Austria, 21-25 October 2023: proceedings". Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 325-326. ISBN 979-8-3503-4254-3. DOI 10.1109/PACT58117.2023.00036.

dc.identifier

979-8-3503-4254-3

dc.identifier

https://hdl.handle.net/2117/403916

dc.identifier

10.1109/PACT58117.2023.00036

dc.description.abstract

The constant growth of DNNs makes them challenging to implement and run efficiently on traditional compute-centric architectures. Some works have attempted to enhance accelerators by adding more compute units and on-chip buffers, but they often worsen the memory issue due to increased bandwidth demands. Memory-centric designs based on Near-Data Processing (NDP) have been proposed to mitigate this problem by moving computations closer to the memory hierarchy. Leveraging 3D-stacked memory for its storage density and near-memory processing capabilities, this paper introduces QeiHaN, a hardware accelerator that optimizes DNN inference efficiency. QeiHaN combines a 3D-stacked memory-centric weight storage scheme with logarithmic quantization of activations, reducing memory accesses by 25%. Evaluation demonstrates significant speedup and energy savings compared to a Neurocube-like accelerator across various DNNs.

dc.description.abstract

QeiHaN has been supported by the CoCoUnit ERC Advanced Grant of the EU’s Horizon 2020 program (grant No. 833057), the Spanish State Research Agency (MCIN/AEI) under grant PID2020-113172RB-I00, and the ICREA Academia program.

dc.description.abstract

Peer Reviewed

dc.description.abstract

Postprint (author's final draft)

dc.format

2 p.

dc.format

application/pdf

dc.language

eng

dc.publisher

Institute of Electrical and Electronics Engineers (IEEE)

dc.relation

https://ieeexplore.ieee.org/document/10364594

dc.relation

info:eu-repo/grantAgreement/EC/H2020/833057/EU/CoCoUnit: An Energy-Efficient Processing Unit for Cognitive Computing/CoCoUnit

dc.relation

info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113172RB-I00/ES/ARQUITECTURAS DE DOMINIO ESPECIFICO PARA SISTEMAS DE COMPUTACION ENERGETICAMENTE EFICIENTES/

dc.rights

Open Access

dc.subject

Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors

dc.subject

Memory management (Computer science)

dc.subject

Energy consumption

dc.subject

DNN

dc.subject

NDP

dc.subject

Quantization

dc.subject

Exponential

dc.subject

Gestió de memòria (Informàtica)

dc.subject

Energia -- Consum

dc.title

QeiHaN: An energy-efficient DNN accelerator that leverages log quantization in NDP architectures

dc.type

Conference lecture

This item appears in the following collection(s)

E-prints [73026]