TY - JOUR
T1 - Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality With At-MRAM Neural Engine
AU - Prasad, Arpan Suravi
AU - Scherer, Moritz
AU - Conti, Francesco
AU - Rossi, Davide
AU - Di Mauro, Alfio
AU - Eggimann, Manuel
AU - Gomez, Jorge Tomas
AU - Li, Ziyun
AU - Sarwar, Syed Shakib
AU - Wang, Zhao
AU - De Salvo, Barbara
AU - Benini, Luca
N1 - Publisher Copyright: © 1966-2012 IEEE.
PY - 2024/7/1
Y1 - 2024/7/1
AB - Extended reality (XR) applications are machine learning (ML)-intensive, featuring deep neural networks (DNNs) with millions of weights, tightly latency-bound (10-20 ms end-to-end), and power-constrained (low tens of mW average power). While ML performance and efficiency can be achieved by introducing neural engines within low-power systems-on-chip (SoCs), system-level power for nontrivial DNNs depends strongly on the energy of non-volatile memory (NVM) access for network weights. This work introduces Siracusa, a near-sensor heterogeneous SoC for next-generation XR devices manufactured in 16 nm CMOS. Siracusa couples an octa-core cluster of RISC-V digital signal processing (DSP) cores with a novel tightly coupled 'At-Memory' integration between a state-of-the-art digital neural engine called N-EUREKA and an on-chip NVM based on magnetoresistive random access memory (MRAM), achieving 1.7× higher throughput and 3× better energy efficiency than XR SoCs using NVM as background memory. The fabricated SoC prototype achieves an area efficiency of 65.2 GOp/s/mm² and a peak energy efficiency of 8.84 TOp/J for DNN inference while supporting complex, heterogeneous application workloads, which combine ML with conventional signal processing and control.
KW - Artificial intelligence (AI)
KW - augmented reality (AR)
KW - deep neural network (DNN)
KW - extended reality (XR)
KW - heterogeneous architecture
KW - magnetoresistive random access memory (MRAM)
KW - non-volatile memory (NVM)
KW - RISC-V
KW - system-on-chip (SoC)
UR - http://www.scopus.com/inward/record.url?scp=85194054109&partnerID=8YFLogxK
U2 - 10.1109/JSSC.2024.3385987
DO - 10.1109/JSSC.2024.3385987
M3 - Article
AN - SCOPUS:85194054109
SN - 0018-9200
VL - 59
SP - 2055
EP - 2069
JO - IEEE Journal of Solid-State Circuits
JF - IEEE Journal of Solid-State Circuits
IS - 7
ER -