Performance Analysis of Traditional and Data-Parallel Primitive Implementations of Visualization and Analysis Kernels

10/05/2020
by   E. Wes Bethel, et al.

Measurements of absolute runtime are useful as a summary of performance when studying parallel visualization and analysis methods on computational platforms of increasing concurrency and complexity. We can gain further insight by examining more detailed measurements from hardware performance counters, such as the number of instructions executed by an algorithm implemented in a particular way, the amount of data moved to/from memory, memory hierarchy utilization levels via cache hit/miss ratios, and so forth. This work focuses on performance analysis on modern multi-core platforms of three different visualization and analysis kernels, each implemented in two different ways: one is "traditional", using combinations of C++ and VTK, and the other uses a data-parallel approach based on VTK-m. Our performance study consists of measuring and reporting several different hardware performance counters on two different multi-core CPU platforms. The results reveal interesting performance differences between these two approaches to implementing the kernels, differences that would not be apparent using runtime as the only metric.


