General Purpose Graphics Processing Units (GPGPU) are used in most of th...
Iterative stencils are used widely across the spectrum of High Performan...
In this humorous and thought-provoking article, we discuss certain myths...
Flash-X is a highly composable multiphysics software system that can be ...
Ptychography is a popular microscopic imaging modality for many scientif...
A considerable amount of research and engineering went into designing pr...
Iterative memory-bound solvers commonly occur in HPC codes. Typical GPU...
Scientific communities are increasingly adopting machine learning and de...
Computed Tomography (CT) is a key 3D imaging technology that fundamental...
Deep Neural Network (DNN) frameworks use distributed training to enable...
Matrix engines or units, in different forms and affinities, are becoming...
The dedicated memory of hardware accelerators can be insufficient to sto...
We propose ParDNN, an automatic, generic, and non-intrusive partitioning...
GPUs are playing an increasingly important role in general-purpose compu...
Stencil computation is one of the most widely used compute patterns in h...
Computed Tomography (CT) is a widely used technology that requires compu...
This paper proposes a versatile high-performance execution model, inspir...
Among the (uncontended) common wisdom in High-Performance Computing (HPC...