Scrooge: A Fast and Memory-Frugal Genomic Sequence Aligner for CPUs, GPUs, and ASICs

by Joël Lindegger, et al.

Motivation: Pairwise sequence alignment is a very time-consuming step in common bioinformatics pipelines. Speeding up this step requires heuristics, efficient implementations, and/or hardware acceleration. A promising candidate for all of the above is the recently proposed GenASM algorithm. We identify and address three inefficiencies in the GenASM algorithm: it incurs a large amount of data movement, has a large memory footprint, and performs some unnecessary work.

Results: We propose Scrooge, a fast and memory-frugal genomic sequence aligner. Scrooge includes three novel algorithmic improvements that reduce the data movement, memory footprint, and number of operations in the GenASM algorithm. We provide efficient open-source implementations of the Scrooge algorithm for CPUs and GPUs, which demonstrate the significant benefits of our algorithmic improvements. For long reads, the CPU version of Scrooge achieves a 15x, 1.7x, and 1.9x speedup over KSW2, Edlib, and a CPU implementation of GenASM, respectively. The GPU version of Scrooge achieves a 4.2x, 63x, 7.4x, 11x, and 5.9x speedup over the CPU version of Scrooge, KSW2, Edlib, Darwin-GPU, and a GPU implementation of GenASM, respectively. We estimate that an ASIC implementation of Scrooge would use 3.6x less chip area and 2.1x less power than a GenASM ASIC while maintaining the same throughput. Further, we systematically analyze the throughput and accuracy behavior of GenASM and Scrooge under a variety of configurations. As the optimal configuration of Scrooge depends on the computing platform, we make several observations that can help guide future implementations of Scrooge.

Availability and implementation:


