Heterogeneous system coherence for integrated CPU-GPU systems

Many future heterogeneous systems will integrate CPUs and GPUs physically on a single chip and logically connect them via shared memory to avoid explicit data copying. Making this shared memory coherent facilitates programming and fine-grained sharing, but throughput-oriented GPUs can overwhelm CPUs with coherence requests not well-filtered by caches. Meanwhile, region coherence has been proposed for CPU-only systems to reduce snoop bandwidth by obtaining coherence permissions for large regions.

This paper develops Heterogeneous System Coherence (HSC) for CPU-GPU systems to mitigate the coherence bandwidth effects of GPU memory requests. HSC replaces a standard directory with a region directory and adds a region buffer to the L2 cache. These structures allow the system to move bandwidth from the coherence network to the high-bandwidth direct-access bus without sacrificing coherence.
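The filtering idea described above can be illustrated with a toy model. This is a minimal sketch, not the paper's protocol: the class names, the `miss` and `acquire` methods, the 1 KB region size, and the 64-byte line size are all assumptions chosen for illustration, and the model never evicts or writes back regions. It shows only the central effect: once an agent holds a region privately, later misses to that region bypass the directory.

```python
REGION_BYTES = 1024  # assumed region size, for illustration only
LINE_BYTES = 64      # assumed cache-line size


class RegionDirectory:
    """Toy region directory: tracks which region buffers hold each region."""

    def __init__(self):
        self.holders = {}   # region id -> set of RegionBuffer objects
        self.requests = 0   # number of requests that reached the directory

    def acquire(self, buffer, region):
        """Register `buffer` as a holder; return True if it is the only holder."""
        self.requests += 1
        holders = self.holders.setdefault(region, set())
        # Any other holder loses its private (direct-access) permission.
        for other in holders:
            if other is not buffer:
                other.private_regions.discard(region)
        holders.add(buffer)
        return len(holders) == 1


class RegionBuffer:
    """Toy region buffer attached to one agent's L2 cache."""

    def __init__(self, agent, directory):
        self.agent = agent
        self.directory = directory
        self.private_regions = set()  # regions this agent may access directly
        self.direct_accesses = 0      # misses served without the directory

    def miss(self, address):
        """Handle an L2 miss; return which path the request takes."""
        region = address // REGION_BYTES
        if region in self.private_regions:
            self.direct_accesses += 1
            return "direct"      # goes over the direct-access bus
        if self.directory.acquire(self, region):
            self.private_regions.add(region)
        return "directory"       # goes over the coherence network


# Usage: a GPU streams through four regions, one miss per cache line.
directory = RegionDirectory()
gpu = RegionBuffer("gpu", directory)
for addr in range(0, 4 * REGION_BYTES, LINE_BYTES):
    gpu.miss(addr)
print(directory.requests, gpu.direct_accesses)  # prints "4 60"
```

In this sketch only the first miss to each region reaches the directory, so 64 misses produce 4 directory requests. If a second agent later touches one of those regions, the first agent loses its private permission and its next miss to that region goes back through the directory.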

Evaluation results with a subset of Rodinia benchmarks and the AMD APP SDK show that HSC can improve performance compared to a conventional directory protocol by an average of more than 2x and a maximum of more than 4.5x. Additionally, HSC reduces the bandwidth to the directory by an average of 94% and by more than 99% for four of the analyzed benchmarks.

Jason Power, Arkaprava Basu, Junli Gu, Sooraj Puthoor, Bradford M. Beckmann, Mark D. Hill, Steven K. Reinhardt, and David A. Wood. 2013. Heterogeneous system coherence for integrated CPU-GPU systems. In Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46). ACM, New York, NY, USA, 457-467. DOI=http://dx.doi.org/10.1145/2540708.2540747

    author = {Power, Jason and Basu, Arkaprava and Gu, Junli and Puthoor, Sooraj and Beckmann, Bradford M. and Hill, Mark D. and Reinhardt, Steven K. and Wood, David A.},
    title = {Heterogeneous System Coherence for Integrated CPU-GPU Systems},
    booktitle = {Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture},
    series = {MICRO-46},
    year = {2013},
    isbn = {978-1-4503-2638-4},
    location = {Davis, California},
    pages = {457--467},
    numpages = {11},
    url = {http://doi.acm.org/10.1145/2540708.2540747},
    doi = {10.1145/2540708.2540747},
    acmid = {2540747},
    publisher = {ACM},
    address = {New York, NY, USA},