Parallel solver for shifted systems in a hybrid CPU-GPU framework

by Nela Bosner, et al.

This paper proposes a combination of a hybrid CPU--GPU and a pure GPU software implementation of a direct algorithm for solving shifted linear systems (A - σ I)X = B with a large number of complex shifts σ and multiple right-hand sides. Such problems often appear, for example, in control theory when evaluating the transfer function, as part of an algorithm performing interpolatory model reduction, when computing pseudospectra and structured pseudospectra, or when solving large linear systems of ordinary differential equations. The proposed algorithm first jointly reduces the general full n × n matrix A and the full n × m right-hand side matrix B to the controller Hessenberg canonical form, which facilitates an efficient solution: A is transformed to a so-called m-Hessenberg form and B is made upper triangular. This reduction is implemented as a blocked, highly parallel CPU--GPU hybrid algorithm; individual blocks are reduced by the CPU, and the necessary updates of the rest of the matrix are split among the cores of the CPU and the GPU. To enhance parallelization, the reduction and the updates are overlapped. In the next phase, the reduced m-Hessenberg--triangular systems are solved entirely on the GPU, with the shifts divided into batches. The benefits of such load distribution are demonstrated by numerical experiments. In particular, we show that the proposed implementation provides an excellent basis for efficient implementations of computational methods in systems and control theory, from the evaluation of transfer functions to interpolatory model reduction.
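The reduce-once, solve-per-shift idea described above can be illustrated with a minimal sequential sketch. It is not the paper's implementation: it assumes NumPy, runs on the CPU only, and uses the ordinary (1-)Hessenberg form A = Q H Q* in place of the m-Hessenberg controller form, so the right-hand side is merely rotated to Q* B rather than made upper triangular. The function names `hessenberg_reduce`, `solve_hessenberg`, and `solve_shifted` are hypothetical. The point it shows is that (A - σ I) X = B is equivalent to (H - σ I) Y = Q* B with X = Q Y, so the expensive reduction is shared by all shifts.

```python
import numpy as np

def hessenberg_reduce(A):
    """Return (H, Q) with A = Q @ H @ Q.conj().T and H upper Hessenberg."""
    n = A.shape[0]
    H = A.astype(complex)            # astype returns a copy
    Q = np.eye(n, dtype=complex)
    for k in range(n - 2):
        x = H[k + 1:, k].copy()
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue                 # column is already reduced
        # Householder vector v such that (I - 2 v v^*) x = alpha * e_1
        alpha = -np.exp(1j * np.angle(x[0])) * norm_x
        v = x
        v[0] -= alpha
        v /= np.linalg.norm(v)
        # Similarity transformation H <- P H P with P = I - 2 v v^*
        H[k + 1:, :] -= 2.0 * np.outer(v, v.conj() @ H[k + 1:, :])
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v.conj())
        # Accumulate Q <- Q P
        Q[:, k + 1:] -= 2.0 * np.outer(Q[:, k + 1:] @ v, v.conj())
    return H, Q

def solve_hessenberg(H, R):
    """Solve H Y = R for upper Hessenberg H using Givens rotations."""
    n = H.shape[0]
    T = H.copy()
    Y = R.astype(complex)
    for k in range(n - 1):
        a, b = T[k, k], T[k + 1, k]
        r = np.hypot(abs(a), abs(b))
        if r == 0.0:
            continue
        c, s = a / r, b / r
        # Unitary 2x2 rotation that zeroes the subdiagonal entry T[k+1, k]
        G = np.array([[c.conjugate(), s.conjugate()], [-s, c]])
        T[k:k + 2, k:] = G @ T[k:k + 2, k:]
        Y[k:k + 2, :] = G @ Y[k:k + 2, :]
    # Back substitution on the resulting upper triangular system
    for i in range(n - 1, -1, -1):
        Y[i, :] = (Y[i, :] - T[i, i + 1:] @ Y[i + 1:, :]) / T[i, i]
    return Y

def solve_shifted(A, B, shifts):
    """Solve (A - sigma I) X = B for every sigma, reducing A only once."""
    H, Q = hessenberg_reduce(A)      # O(n^3), done once
    R = Q.conj().T @ B
    I = np.eye(A.shape[0])
    # Each shift then costs one O(n^2) Hessenberg solve per right-hand side
    return [Q @ solve_hessenberg(H - sigma * I, R) for sigma in shifts]

# Small illustrative problem (sizes and shifts are arbitrary choices)
rng = np.random.default_rng(0)
n, m = 6, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
shifts = [0.5 + 1.0j, -2.0j, 3.0 + 0.1j]
solutions = solve_shifted(A, B, shifts)
max_residual = max(
    np.linalg.norm((A - s * np.eye(n)) @ X - B) for s, X in zip(shifts, solutions)
)
```

In this sketch the per-shift solves are independent of one another, which is the property that makes it natural to process shifts in batches on a GPU, as the abstract describes.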




SParSH-AMG: A library for hybrid CPU-GPU algebraic multigrid and preconditioned iterative methods

Hybrid CPU-GPU algorithms for Algebraic Multigrid methods (AMG) to effic...

Parallel Prony's method with multivariate matrix pencil approach and its numerical aspect

Prony's method is a standard tool exploited for solving many imaging and...

Simultaneous Solving of Batched Linear Programs on a GPU

Linear Programs (LPs) appear in a large number of applications and offlo...

Hybrid Fortran: High Productivity GPU Porting Framework Applied to Japanese Weather Prediction Model

In this work we use the GPU porting task for the operative Japanese weat...

Optimized Multivariate Polynomial Determinant on GPU

We present an optimized algorithm calculating determinant for multivaria...

Communication-free and Parallel Simulation of Neutral Biodiversity Models

We present a novel communication-free algorithm for individual-based pro...

Parallel Implementations of Cellular Automata for Traffic Models

The Biham-Middleton-Levine (BML) traffic model is a simple two-dimension...
