CuttleSys: Data-Driven Resource Management for Interactive Applications on Reconfigurable Multicores

08/01/2020
by Neeraj Kulkarni, et al.
Multi-tenancy for latency-critical applications leads to resource interference and unpredictable performance. Core reconfiguration opens up more opportunities for colocation, as it allows the hardware to adjust to the dynamic performance and power needs of a specific mix of co-scheduled applications. However, reconfigurability also introduces challenges, as even for a small number of reconfigurable cores, exploring the design space becomes more time- and resource-demanding. We present CuttleSys, a runtime for reconfigurable multicores that leverages scalable and lightweight data mining to quickly identify suitable core and cache configurations for a set of co-scheduled applications. The runtime combines collaborative filtering, to infer the behavior of each job on every core and cache configuration, with Dynamically Dimensioned Search, to efficiently explore the configuration space. We evaluate CuttleSys on multicores with tens of reconfigurable cores and show up to 2.46x and 1.55x performance improvements compared to core-level gating and oracle-like asymmetric multicores, respectively, under stringent power constraints.
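
To make the search step more concrete, the following is a minimal sketch of the generic Dynamically Dimensioned Search heuristic on a continuous, box-bounded problem. It is not the CuttleSys implementation: the function name `dds_minimize`, its parameters, and the toy objective are assumptions made for illustration, and the abstract does not say how the runtime encodes core and cache configurations or power constraints. The sketch shows the two ideas that define the heuristic: the number of dimensions perturbed shrinks as the iteration budget is spent, and a candidate is kept only if it is no worse than the best point so far.

```python
import math
import random


def dds_minimize(objective, lower, upper, max_iters=200, r=0.2, seed=0):
    """Generic DDS sketch (hypothetical helper, not the paper's code).

    objective: callable taking a list of floats and returning a cost.
    lower/upper: per-dimension bounds. r: perturbation size as a
    fraction of each dimension's range.
    """
    if max_iters < 2:
        raise ValueError("max_iters must be at least 2")
    rng = random.Random(seed)
    dims = len(lower)

    # Start from a random point inside the bounds.
    best = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    best_cost = objective(best)

    for i in range(1, max_iters + 1):
        # Probability of perturbing each dimension decays from 1 to 0,
        # moving the search from global to local.
        p = 1.0 - math.log(i) / math.log(max_iters)
        chosen = [d for d in range(dims) if rng.random() < p]
        if not chosen:
            # Always perturb at least one dimension.
            chosen = [rng.randrange(dims)]

        candidate = list(best)
        for d in chosen:
            span = upper[d] - lower[d]
            x = candidate[d] + rng.gauss(0.0, r * span)
            # Reflect at the bounds; fall back to the bound itself if the
            # reflected value overshoots the opposite side.
            if x < lower[d]:
                x = lower[d] + (lower[d] - x)
                if x > upper[d]:
                    x = lower[d]
            elif x > upper[d]:
                x = upper[d] - (x - upper[d])
                if x < lower[d]:
                    x = upper[d]
            candidate[d] = x

        # Greedy acceptance: keep the candidate only if it is no worse.
        cost = objective(candidate)
        if cost <= best_cost:
            best, best_cost = candidate, cost

    return best, best_cost


def sphere(x):
    """Toy objective for illustration: sum of squares, minimum at the origin."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    point, cost = dds_minimize(sphere, [-5.0] * 4, [5.0] * 4, max_iters=500)
    print(point, cost)
```

In a setting like the one the abstract describes, the objective would presumably be evaluated on predicted rather than measured behavior, so that each candidate configuration is cheap to score; that coupling is an assumption here, not something this sketch implements.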

research
06/23/2017

HourGlass: Predictable Time-based Cache Coherence Protocol for Dual-Critical Multi-Core Systems

We present a hardware mechanism called HourGlass to predictably share da...
research
11/12/2019

Coordinated Management of DVFS and Cache Partitioning under QoS Constraints to Save Energy in Multi-Core Systems

Reducing the energy expended to carry out a computational task is import...
research
08/01/2021

Versa: A Dataflow-Centric Multiprocessor with 36 Systolic ARM Cortex-M4F Cores and a Reconfigurable Crossbar-Memory Hierarchy in 28nm

We present Versa, an energy-efficient processor with 36 systolic ARM Cor...
research
04/17/2023

Dynamically Reconfigurable Variable-precision Sparse-Dense Matrix Acceleration in Tensorflow Lite

In this paper, we present a dynamically reconfigurable hardware accelera...
research
03/20/2017

Formalizing Memory Accesses and Interrupts

The hardware/software boundary in modern heterogeneous multicore compute...
research
04/11/2023

Performance Study of Partitioned Caches in Asymmetric Multi-Core Processors

The current workloads and applications are highly diversified, facing cr...
research
07/09/2020

IOCA: High-Speed I/O-Aware LLC Management for Network-Centric Multi-Tenant Platform

In modern server CPUs, last-level cache (LLC) is a critical hardware res...