Segmented correspondence curve regression model for quantifying reproducibility of high-throughput experiments

07/03/2018
by Feipeng Zhang, et al.

The reliability of a high-throughput biological experiment depends heavily on the settings of the operational factors in its experimental and data-analytic procedures. Understanding how operational factors influence the reproducibility of the experimental outcome is critical for constructing robust workflows and obtaining reliable results. One challenge in this area is that candidates at different levels of significance may respond to the operational factors differently. To model this heterogeneity, we develop a novel segmented regression model, based on the rank concordance between candidates from different replicate samples, that characterizes the varying effects of operational factors on candidates at different levels of significance. A grid search method is developed to identify the change point in the response to the operational factors and to estimate the covariate effects while accounting for the change. A sup-likelihood-ratio-type test is proposed to test for the existence of a change point. Simulation studies show that our method yields a well-calibrated type I error, is powerful in detecting differences in reproducibility, and fits the data better than the existing method. An application to a ChIP-seq dataset reveals how sequencing depth affects the reproducibility of experimental results, demonstrating the usefulness of our method in designing cost-effective and reliable high-throughput workflows.
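The grid-search idea in the abstract can be illustrated with a generic sketch. The code below is not the authors' correspondence curve regression model: it assumes an ordinary least-squares segmented line on simulated data, and every name in it (`fit_segmented`, `true_tau`, `grid`, `sup_lr`) is hypothetical. It scans candidate change points, keeps the one with the smallest residual sum of squares, and then forms a sup-likelihood-ratio-type statistic against the no-change-point fit under an assumed Gaussian error model.

```python
import numpy as np


def fit_segmented(x, y, grid):
    """Grid search for one change point tau in
    y = b0 + b1 * x + b2 * max(x - tau, 0) + noise,
    fitted by ordinary least squares at each candidate tau."""
    best = None
    for tau in grid:
        # Design matrix: intercept, slope, and a hinge term that switches on after tau.
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - tau, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(tau), beta)
    return best


# Simulated data with a known change point (illustrative values only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
true_tau = 0.6
y = 1.0 + 0.5 * x + 2.0 * np.maximum(x - true_tau, 0.0) + rng.normal(0.0, 0.05, x.size)

# Candidate change points; endpoints are excluded so both segments contain data.
grid = np.linspace(0.1, 0.9, 81)
rss, tau_hat, beta_hat = fit_segmented(x, y, grid)

# Null model without a change point: a single straight line.
X0 = np.column_stack([np.ones_like(x), x])
beta0, *_ = np.linalg.lstsq(X0, y, rcond=None)
rss0 = float(np.sum((y - X0 @ beta0) ** 2))

# Sup-LR-type statistic under Gaussian errors: the likelihood ratio maximized over the grid.
sup_lr = x.size * np.log(rss0 / rss)

print(f"estimated change point: {tau_hat:.3f}")
print(f"sup-LR-type statistic: {sup_lr:.1f}")
```

Because the change point is not identified under the null hypothesis, a statistic of this kind does not follow the usual chi-squared reference distribution. The sketch only computes the statistic and does not attempt to calibrate it.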

research
08/02/2022

Quantifying the Reproducibility of Cell-Perturbation Experiments

Experiments adhering to the same protocol can nonetheless lead to differ...
research
05/09/2018

Parameter estimation for high dimensional change point regression models without grid search

We propose an L1 regularized estimator for the parameters of a high dime...
research
12/31/2020

CauchyCP: a powerful test under non-proportional hazards using Cauchy combination of change-point Cox regressions

Non-proportional hazards data are routinely encountered in randomized cl...
research
09/23/2021

Optimal Decision Making in High-Throughput Virtual Screening Pipelines

Effective selection of the potential candidates that meet certain condit...
research
09/04/2021

Modeling the Evolution of Infectious Diseases with Functional Data Models: The Case of COVID-19 in Brazil

In this paper, we apply statistical methods for functional data to expla...
research
03/15/2021

SEMgraph: An R Package for Causal Network Analysis of High-Throughput Data with Structural Equation Models

With the advent of high-throughput sequencing (HTS) in molecular biology...
research
09/06/2021

Robust Narrowest Significance Pursuit: inference for multiple change-points in the median

We propose Robust Narrowest Significance Pursuit (RNSP), a methodology f...
