LazyPIM: Efficient Support for Cache Coherence in Processing-in-Memory Architectures

by Amirali Boroumand, et al.

Processing-in-memory (PIM) architectures have seen an increase in popularity recently, as the high internal bandwidth available within 3D-stacked memory provides greater incentive to move some computation into the logic layer of the memory. To maintain program correctness, the portions of a program that are executed in memory must remain coherent with the portions of the program that continue to execute within the processor. Unfortunately, PIM architectures cannot use traditional approaches to cache coherence due to the high off-chip traffic consumed by coherence messages, which, as we illustrate in this work, can undo the benefits of PIM execution for many data-intensive applications. We propose LazyPIM, a new hardware cache coherence mechanism designed specifically for PIM. Prior approaches for coherence in PIM are ill-suited to applications that share a large amount of data between the processor and the PIM logic. LazyPIM uses a combination of speculative cache coherence and compressed coherence signatures to greatly reduce the overhead of keeping PIM coherent with the processor, even when a large amount of sharing exists. We find that LazyPIM improves average performance across a range of data-intensive PIM applications by 19.6% and reduces energy consumption by 18.0%.
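The abstract names two ingredients, speculative coherence and compressed coherence signatures, without giving implementation details. The sketch below is only an illustration of how those two ideas can fit together in general, not the authors' design. It assumes a Bloom-filter-style signature, and the names `Signature` and `run_speculatively`, the 256-bit size, and the use of SHA-256 as a hash are all hypothetical choices made for this example.

```python
import hashlib


class Signature:
    """Fixed-size, Bloom-filter-style summary of a set of cache-line addresses.

    Assumption for illustration: the abstract does not specify the signature
    format; a Bloom filter is one common way to compress an address set.
    """

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # the whole signature is a single integer bit vector

    def _positions(self, addr):
        # Derive num_hashes bit positions from the address.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{addr}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, addr):
        for pos in self._positions(addr):
            self.bits |= 1 << pos

    def may_contain(self, addr):
        # True may be a false positive; False is always exact.
        return all((self.bits >> pos) & 1 for pos in self._positions(addr))


def run_speculatively(pim_reads, cpu_writes):
    """Decide whether a speculative PIM region can commit.

    The PIM side runs without exchanging per-access coherence messages and
    records what it read in a signature. At the end, the signature is checked
    against the lines the processor wrote in the meantime.
    """
    read_sig = Signature()
    for addr in pim_reads:
        read_sig.add(addr)
    conflict = any(read_sig.may_contain(addr) for addr in cpu_writes)
    return "rollback" if conflict else "commit"


# A processor write to a line the PIM region read forces a rollback.
print(run_speculatively([0x1000, 0x1040], [0x1040]))
```

In this sketch a single fixed-size signature stands in for an arbitrarily long list of addresses, which is where the traffic saving would come from. A false positive in the signature can only cause an unnecessary rollback, never a missed conflict, so compression costs performance rather than correctness.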




