Computational Performance of a Germline Variant Calling Pipeline for Next Generation Sequencing

04/01/2020
by Jie Liu, et al.

With the rapid growth of next generation sequencing technology and its adoption in clinical practice and life science research, the need for faster and more efficient data analysis methods has become pressing. Here we report an evaluation of an optimized germline mutation calling pipeline, HummingBird, assessing its performance against the widely accepted BWA-GATK pipeline. We found that the HummingBird pipeline can significantly reduce the running time of primary data analysis for whole genome sequencing and whole exome sequencing without significantly sacrificing variant calling accuracy. We therefore conclude that wider use of such software will help improve the efficiency of primary data analysis for next generation sequencing.


