A Scalable and Energy Efficient GPU Thread Map for m-Simplex Domains

by Cristóbal A. Navarro, et al.

This work proposes a new GPU thread map for m-simplex domains whose speedup scales with dimension and which is energy efficient compared to other state-of-the-art approaches. The main contributions of this work are i) the formulation of the new block-space map ℋ: ℤ^m ↦ ℤ^m for regular orthogonal simplex domains, which is analyzed in terms of resource usage, and ii) the experimental evaluation in terms of speedup over a bounding box approach and energy efficiency, measured as elements per second per Watt. Results from the analysis show that ℋ has a potential speedup of up to 2× and 6× for 2-simplices and 3-simplices, respectively. Experimental evaluation shows that ℋ is competitive for 2-simplices, reaching speedups of 1.2× to 2.0× across different tests, which is on par with the fastest state-of-the-art approaches. For 3-simplices, ℋ reaches speedups of 1.3× to 6.0×, making it the fastest of all. The extension of ℋ to higher-dimensional m-simplices is feasible and has a potential speedup that scales as m!, given a proper selection of the parameters r and β, which are the scaling and replication factors, respectively. In terms of energy consumption, although ℋ is among the approaches with the highest power draw, its short execution time compensates, making it one of the most energy-efficient approaches. Lastly, further improvements with Tensor Cores and Ray Tracing Cores are analyzed, giving insights into how to leverage each of them. The results obtained in this work show that ℋ is a scalable and energy-efficient map that can contribute to the efficiency of GPU applications that need to process m-simplex domains, such as Cellular Automata or PDE simulations.
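The 2×, 6× and m! figures quoted in the abstract can be motivated with a simple counting argument. The sketch below is not the paper's analysis of ℋ (which works in block space and depends on r and β); it assumes a discrete orthogonal m-simplex of side n containing C(n+m-1, m) elements and compares it with the n^m threads a bounding box would launch. The function names are hypothetical, and `elements_per_second_per_watt` simply restates the energy-efficiency metric named in the abstract.

```python
from math import comb, factorial


def simplex_elements(n: int, m: int) -> int:
    """Elements in a discrete orthogonal m-simplex of side n.

    Assumption: the simplex includes its diagonal, giving C(n + m - 1, m)
    elements (e.g. n(n+1)/2 for a triangle).
    """
    return comb(n + m - 1, m)


def bounding_box_elements(n: int, m: int) -> int:
    """Threads launched by a bounding-box approach: the full n^m hypercube."""
    return n ** m


def potential_speedup(n: int, m: int) -> float:
    """Ratio of bounding-box work to simplex work; tends to m! as n grows."""
    return bounding_box_elements(n, m) / simplex_elements(n, m)


def elements_per_second_per_watt(elements: int, seconds: float, watts: float) -> float:
    """Energy-efficiency metric: elements processed per second per Watt."""
    return elements / seconds / watts


if __name__ == "__main__":
    n = 1024
    for m in (2, 3, 4):
        ratio = potential_speedup(n, m)
        print(f"m={m}: ratio={ratio:.3f}, limit m!={factorial(m)}")
```

Under this model the ratio approaches m! from below as n grows, which is consistent with the upper bounds of 2× and 6× reported for 2-simplices and 3-simplices; real kernels would also pay the cost of evaluating the map itself.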




