BOPS, Not FLOPS! A New Metric, Measuring Tool, and Roofline Performance Model For Datacenter Computing

01/28/2018
by Lei Wang, et al.

The past decades have witnessed FLOPS (Floating-point Operations per Second), as an important computation-centric performance metric, guiding computer architecture evolution, bridging hardware and software co-design, and providing quantitative performance numbers for system optimization. However, for emerging datacenter computing (in short, DC) workloads, such as Internet services or big data analytics, previous work on modern CPU architectures reports that the average proportion of floating-point instructions is only 1% and the average FLOPS efficiency is only 0.1%, while the average CPU utilization is as high as 63%. These contradictory numbers indicate that FLOPS is inappropriate for evaluating DC computer systems. To address this issue, we propose a new computation-centric metric, BOPS (Basic OPerations per Second). In our definition, basic operations include all arithmetic, logical, comparing, and array-addressing operations on integer and floating-point data, and BOPS is the average number of BOPs (Basic OPerations) completed each second. To that end, we present a dwarf-based measuring tool to evaluate DC computer systems in terms of our new metric. On the basis of BOPS, we also propose a new roofline performance model for DC computing. Through experiments, we demonstrate that our new metric BOPS, the measuring tool, and the new performance model indeed facilitate DC computer system design and optimization.
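Since the abstract only sketches the definitions, the snippet below is a minimal illustration, not the paper's measuring tool or exact model. It assumes that BOPS is computed as the total count of basic operations (arithmetic, logical, comparing, and array addressing, integer or floating point) divided by wall-clock time, and that the BOPS-based roofline keeps the classical roofline form, min(peak BOPS, memory bandwidth × operation intensity), with operation intensity expressed in BOPs per byte. All function names and workload numbers are placeholders.

```python
# Hypothetical sketch: BOPS and a BOPS-based roofline ceiling, assuming the
# classical roofline structure with FLOPs replaced by basic operations (BOPs).
from dataclasses import dataclass


@dataclass
class OpCounts:
    """Counts of basic operations observed during a measured run."""
    arithmetic: int        # integer and floating-point add/sub/mul/div, etc.
    logical: int           # and/or/xor/shift operations
    comparing: int         # comparison operations
    array_addressing: int  # address calculations for array accesses

    @property
    def total_bops(self) -> int:
        # BOPs = sum of all counted basic operations.
        return (self.arithmetic + self.logical
                + self.comparing + self.array_addressing)


def bops(counts: OpCounts, elapsed_seconds: float) -> float:
    """Average Basic OPerations per Second over a measured run."""
    return counts.total_bops / elapsed_seconds


def bops_roofline(peak_bops: float,
                  mem_bandwidth_bytes_per_s: float,
                  bops_per_byte: float) -> float:
    """Attainable BOPS under a roofline-style model:
    min(compute ceiling, memory bandwidth * operation intensity)."""
    return min(peak_bops, mem_bandwidth_bytes_per_s * bops_per_byte)


if __name__ == "__main__":
    # Placeholder counts for a hypothetical DC workload run of 10 seconds.
    counts = OpCounts(arithmetic=60_000_000_000, logical=15_000_000_000,
                      comparing=20_000_000_000, array_addressing=25_000_000_000)
    measured = bops(counts, elapsed_seconds=10.0)
    attainable = bops_roofline(peak_bops=4.0e11,
                               mem_bandwidth_bytes_per_s=5.0e10,
                               bops_per_byte=4.0)
    print(f"measured BOPS: {measured:.3e}, "
          f"roofline-attainable BOPS: {attainable:.3e}")
```

As in the classical roofline, a workload whose measured BOPS sits well below the attainable ceiling at its operation intensity is a candidate for optimization, which is the kind of quantitative guidance the BOPS metric is intended to provide.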
