Design Space Exploration of Algorithmic Multi-Port Memories in High-Performance Application-Specific Accelerators
Memory load/store instructions consume an important part in execution time and energy consumption in domain-specific accelerators. For designing highly parallel systems, available parallelism at each granularity is extracted from the workloads. The maximal use of parallelism at each granularity in these high-performance designs requires the utilization of multi-port memories. Currently, true multiport designs are less popular because there is no inherent EDA support for multiport memory beyond 2-ports, utilizing more ports requires circuit-level implementation and hence a high design time. In this work, we present a framework for Design Space Exploration of Algorithmic Multi-Port Memories (AMM) in ASICs. We study different AMM designs in the literature, discuss how we incorporate them in the Pre-RTL Aladdin Framework with different memory depth, port configurations and banking structures. From our analysis on selected applications from the MachSuite (accelerator benchmark suite), we understand and quantify the potential use of AMMs (as true multiport memories) for high performance in applications with low spatial locality in memory access patterns.
READ FULL TEXT