Rapid and Scalable Bayesian AB Testing

by   Srivas Chennu, et al.

AB testing aids business operators with their decision making, and is considered the gold standard method for learning from data to improve digital user experiences. However, there is usually a gap between the requirements of practitioners, and the constraints imposed by the statistical hypothesis testing methodologies commonly used for analysis of AB tests. These include the lack of statistical power in multivariate designs with many factors, correlations between these factors, the need of sequential testing for early stopping, and the inability to pool knowledge from past tests. Here, we propose a solution that applies hierarchical Bayesian estimation to address the above limitations. In comparison to current sequential AB testing methodology, we increase statistical power by exploiting correlations between factors, enabling sequential testing and progressive early stopping, without incurring excessive false positive risk. We also demonstrate how this methodology can be extended to enable the extraction of composite global learnings from past AB tests, to accelerate future tests. We underpin our work with a solid theoretical framework that articulates the value of hierarchical estimation. We demonstrate its utility using both numerical simulations and a large set of real-world AB tests. Together, these results highlight the practical value of our approach for statistical inference in the technology industry.


page 1

page 2

page 3

page 4


On Frequentist and Bayesian Sequential Clinical Trial Designs

Clinical trials usually involve sequential patient entry. When designing...

Datasets for Online Controlled Experiments

Online Controlled Experiments (OCE) are the gold standard to measure imp...

A Rademacher Complexity Based Method fo rControlling Power and Confidence Level in Adaptive Statistical Analysis

While standard statistical inference techniques and machine learning gen...

Exact Inference of Linear Dependence Between Multiple Autocorrelated Time Series

The ability to quantify complex relationships within multivariate time s...

On Error Exponents of Almost-Fixed-Length Channel Codes and Hypothesis Tests

We examine a new class of channel coding strategies, and hypothesis test...

Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring

With the growing needs of online A/B testing to support the innovation i...

Classification Accuracy as a Proxy for Two Sample Testing

When data analysts train a classifier and check if its accuracy is signi...

Please sign up or login with your details

Forgot password? Click here to reset