Heterogeneous Memory

Cross-layer rethinking of the memory hierarchy for heterogeneous systems

Due to technological changes such as the end of Moore’s Law, the end of Dennard Scaling, and an explosion in data, computing systems are becoming more heterogeneous. However, the underlying hardware and software for these systems were developed for homogeneous systems, and directly applying current techniques to heterogeneous systems results in poor usability, poor performance, or both. This proposal focuses on improving the software interfaces for heterogeneous memory systems through new hardware mechanisms. New memory technologies are proliferating (e.g., HBM, HMC, PCM, 3D-XPoint, ReRAM, and disaggregated memory). They differ in latency, bandwidth, and cost, and they introduce new design parameters such as persistence and asymmetric read and write times. Because of these changing technological constraints, future systems are likely to be deeply heterogeneous, combining many different memory technologies. Thus, it is time to redesign the memory hierarchy to accommodate heterogeneous memory systems.

Professor Jason Lowe-Power is currently looking for motivated students to work on these projects. Email Jason (jlowepower@ucdavis.edu) to set up a meeting to find out more information about these projects.

Modeling NUMA and DRAM caches in gem5

One of the initial projects to begin investigating heterogeneous memory is to develop software models of previous designs. This project will develop models and create experiments to test both NUMA systems and DRAM cache designs in gem5.

First, we will implement NUMA support in the gem5 full-system simulator. The goal is to correctly model NUMA systems with many NUMA nodes, such as AMD’s EPYC. We will use gem5’s KVM-based fast-forwarding to make simulating large NUMA systems practical.
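To make the NUMA effect concrete, the following is a minimal sketch of the kind of first-order latency model that a detailed simulator refines. It is plain Python, not gem5 code. The function name, the two-node distance table, and the 80 ns local latency are all hypothetical placeholders chosen for illustration, not measurements of any real system. The distance values follow the convention of the ACPI SLIT table, where 10 means a local access.

```python
# Toy NUMA latency model. All numbers are made-up placeholders, not measurements.

def average_latency(distance, local_ns, access_counts):
    """Return the access-weighted mean memory latency in nanoseconds.

    distance[i][j] is a relative cost in the style of an ACPI SLIT table
    (10 means local), so latency(i, j) = local_ns * distance[i][j] / 10.
    access_counts[i][j] is the number of accesses that CPUs on node i
    make to memory attached to node j.
    """
    total_time = 0.0
    total_accesses = 0
    for i, row in enumerate(access_counts):
        for j, count in enumerate(row):
            total_time += count * local_ns * distance[i][j] / 10
            total_accesses += count
    if total_accesses == 0:
        raise ValueError("no accesses recorded")
    return total_time / total_accesses


# Hypothetical two-node system in which a remote access costs 2x a local one.
DISTANCE = [[10, 20], [20, 10]]
LOCAL_NS = 80.0

if __name__ == "__main__":
    all_local = [[100, 0], [0, 100]]
    evenly_spread = [[50, 50], [50, 50]]
    print(average_latency(DISTANCE, LOCAL_NS, all_local))
    print(average_latency(DISTANCE, LOCAL_NS, evenly_spread))
```

A model like this ignores contention, queuing, and coherence traffic, which is the motivation for modeling NUMA in a full-system simulator instead.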

Second, we will build a flexible DRAM cache model. On top of it, we will implement state-of-the-art DRAM cache designs (e.g., BEAR) and a new DRAM cache design (adaptive victim). We will compare these DRAM cache designs in a full-system simulator (gem5) and show that gem5 is a better model than simpler simulators. We will also compare these designs against the NUMA system described above to contrast the NUMA and cache approaches.
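As a point of reference for what a DRAM cache model has to track, here is a minimal sketch of a direct-mapped cache that only counts hits and misses. It is plain Python, not gem5 code, and it is not BEAR or the adaptive victim design. The class name, the 64-byte block size, and the method names are assumptions made for illustration.

```python
class DirectMappedDramCache:
    """Toy direct-mapped cache: one tag per set, hit/miss counting only."""

    def __init__(self, num_sets, block_bytes=64):
        self.num_sets = num_sets
        self.block_bytes = block_bytes
        self.tags = [None] * num_sets  # None marks an empty set
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Look up a byte address; return True on a hit, False on a miss."""
        block = addr // self.block_bytes   # block number in memory
        index = block % self.num_sets      # set the block maps to
        tag = block // self.num_sets       # identifies the block within the set
        if self.tags[index] == tag:
            self.hits += 1
            return True
        # Miss: fill the set, evicting whatever block was there before.
        self.tags[index] = tag
        self.misses += 1
        return False

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    cache = DirectMappedDramCache(num_sets=4)
    for addr in (0, 0, 32, 256, 0):
        print(hex(addr), "hit" if cache.access(addr) else "miss")
    print("hit rate:", cache.hit_rate())
```

A sketch like this captures only the mapping and replacement behavior. It has no notion of timing, bandwidth, or tag storage in DRAM, which are the aspects that real DRAM cache designs differ on and that a detailed simulator is needed to evaluate.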