Read-Tuned STT-RAM and eDRAM Cache Hierarchies for Throughput and Energy Enhancement

by Navid Khoshavi, et al.

As the capacity and complexity of on-chip cache hierarchies increase, the cost of servicing critical loads from the Last Level Cache (LLC), which are frequently repeated, has become a major concern. If there are no independent instructions to execute, the processor may stall for a considerable interval while waiting for data stored in LLC cache blocks. To accelerate service of critical load requests from the LLC, this work leverages the additional capacity gained by replacing an SRAM-based L2 with Spin-Transfer Torque Random Access Memory (STT-RAM) to hold frequently accessed cache blocks in an exclusive read mode, reducing the overall read service time. The proposed technique partitions the L2 cache into two STT-RAM arrangements with different write performance and data retention time. Retention-relaxed STT-RAM arrays handle regular L2 cache requests, while high-retention STT-RAM arrays in L2 hold cache blocks that are repeatedly read from the LLC, incurring negligible energy consumption for data retention. Experimental results show that the proposed technique reduces the mean L2 read miss ratio by 51.4% across the PARSEC benchmark suite while significantly decreasing total L2 energy consumption compared to a conventional SRAM-based L2 design.
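The partitioning idea can be illustrated with a toy software model. The following is only a minimal sketch under stated assumptions, not the authors' design: the class name `PartitionedL2`, the read-count promotion rule, the `read_threshold` parameter, the LRU replacement, and the demote-on-write behavior are all hypothetical choices made for illustration. The abstract does not specify how read-hot blocks are detected or moved.

```python
from collections import OrderedDict


class PartitionedL2:
    """Toy model of an L2 split into two regions (all policies are assumptions).

    lr: retention-relaxed region, serves regular read/write requests.
    hr: high-retention region, holds read-hot blocks in read-only mode.
    """

    def __init__(self, lr_blocks=4, hr_blocks=2, read_threshold=3):
        self.lr = OrderedDict()  # addr -> read hits since fill, in LRU order
        self.hr = OrderedDict()  # addr -> None, in LRU order
        self.lr_blocks = lr_blocks
        self.hr_blocks = hr_blocks
        self.read_threshold = read_threshold  # hypothetical promotion rule

    def read(self, addr):
        if addr in self.hr:
            self.hr.move_to_end(addr)  # refresh LRU position
            return "hr_hit"
        if addr in self.lr:
            count = self.lr.pop(addr) + 1
            if count >= self.read_threshold:
                # Block is repeatedly read: move it to the high-retention region.
                self._insert(self.hr, self.hr_blocks, addr, None)
                return "lr_hit_promoted"
            self.lr[addr] = count  # re-insert at the most recently used end
            return "lr_hit"
        self._insert(self.lr, self.lr_blocks, addr, 0)  # fill on miss
        return "miss"

    def write(self, addr):
        # Writes always land in the retention-relaxed region; a block held
        # in the read-only region is demoted first (an assumption).
        self.hr.pop(addr, None)
        self.lr.pop(addr, None)
        self._insert(self.lr, self.lr_blocks, addr, 0)

    @staticmethod
    def _insert(region, capacity, addr, value):
        if len(region) >= capacity:
            region.popitem(last=False)  # evict the least recently used block
        region[addr] = value


cache = PartitionedL2(lr_blocks=2, hr_blocks=1, read_threshold=2)
trace = [cache.read(0xA) for _ in range(4)]
print(trace)
```

In this sketch a block is filled into the retention-relaxed region on a miss, promoted once it has been read `read_threshold` times after the fill, and served from the high-retention region afterwards. A real design would also have to model write latency, retention expiry, and refresh, none of which are captured here.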




