LLM: Realizing Low-Latency Memory by Exploiting Embedded Silicon Photonics for Irregular Workloads

Marjan Fariborz, Mahyar Samani, Pouya Fotouhi, Roberto Proietti, Il-Min Yi, Venkatesh Akella, Jason Lowe-Power, Samuel Palermo, S. J. Ben Yoo. ISC-HPC 2022.

As emerging workloads exhibit irregular memory access patterns with poor data reuse and locality, they would benefit from a DRAM that achieves low latency without sacrificing bandwidth and energy efficiency. We propose LLM (Low Latency Memory), a codesign of the DRAM microarchitecture, the memory controller and the LLC/DRAM interconnect by leveraging embedded silicon photonics in a 2.5D/3D integrated system on chip. LLM relies on Wavelength Division Multiplexing (WDM)-based photonic interconnects to reduce contention throughout the memory subsystem. LLM also increases bank-level parallelism, eliminates bus conflicts by using dedicated optical data paths, and reduces the access energy per bit with shorter global bitlines and smaller row buffers. We evaluate the design space of LLM for a variety of synthetic benchmarks and representative graph workloads on a full-system simulator (gem5). LLM exhibits low memory access latency for traffic with both regular and irregular access patterns. For irregular traffic, LLM achieves high bandwidth utilization (over 80% of peak throughput, compared to 20% for HBM2.0). For real workloads, LLM achieves 3× and 1.8× lower execution time compared to HBM2.0 and a state-of-the-art memory system with high memory-level parallelism, respectively. This study also demonstrates that by reducing queuing on the data path, LLM can achieve on average 3.4× lower memory latency variation compared to HBM2.0.

author = {Fariborz, Marjan and Samani, Mahyar and Fotouhi, Pouya and Proietti, Roberto and Yi, Il-Min and Akella, Venkatesh and Lowe-Power, Jason and Palermo, Samuel and Yoo, S. J. Ben},
title = {LLM: Realizing Low-Latency Memory By Exploiting Embedded Silicon Photonics For Irregular Workloads},
year = {2022},
isbn = {978-3-031-07311-3},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
url = {https://doi.org/10.1007/978-3-031-07312-0_3},
doi = {10.1007/978-3-031-07312-0_3},
booktitle = {High Performance Computing: 37th International Conference, ISC High Performance 2022, Hamburg, Germany, May 29 – June 2, 2022, Proceedings},
pages = {44–64},
numpages = {21},
location = {Hamburg, Germany}