Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Universitat Politècnica de Catalunya. ARCO - Microarquitectura i Compiladors
2023
The constant growth of DNNs makes them challenging to implement and run efficiently on traditional compute-centric architectures. Some works have attempted to enhance accelerators by adding more compute units and on-chip buffers, but this often worsens the memory bottleneck by increasing bandwidth demands. Memory-centric designs based on Near-Data Processing (NDP) have been proposed to mitigate this problem by moving computation closer to the memory hierarchy. Leveraging 3D-stacked memory for its storage density and near-memory processing capabilities, this paper introduces QeiHaN, a hardware accelerator that improves the efficiency of DNN inference. QeiHaN combines a 3D-stacked, memory-centric weight storage scheme with logarithmic quantization of activations, reducing memory accesses by 25%. The evaluation shows significant speedup and energy savings over a Neurocube-like accelerator across various DNNs.
QeiHaN has been supported by the CoCoUnit ERC Advanced Grant of the EU’s Horizon 2020 program (grant No. 833057), the Spanish State Research Agency (MCIN/AEI) under grant PID2020-113172RB-I00, and the ICREA Academia program.
Peer Reviewed
Postprint (author's final draft)
Conference lecture
English
UPC subject areas::Computer science::Computer architecture; Memory management (Computer science); Energy consumption; DNN; NDP; Quantization; Exponential
Institute of Electrical and Electronics Engineers (IEEE)
https://ieeexplore.ieee.org/document/10364594
info:eu-repo/grantAgreement/EC/H2020/833057/EU/CoCoUnit: An Energy-Efficient Processing Unit for Cognitive Computing/CoCoUnit
info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113172RB-I00/ES/ARQUITECTURAS DE DOMINIO ESPECIFICO PARA SISTEMAS DE COMPUTACION ENERGETICAMENTE EFICIENTES/
Open Access