Rank-transformed subsampling: inference for multiple data splitting and exchangeable p-values

01/06/2023
by F. Richard Guo, et al.

Many testing problems are readily amenable to randomised tests, such as those employing data splitting, which divide the data into disjoint parts for separate purposes. However, despite their usefulness in principle, randomised tests have obvious drawbacks. Firstly, two analyses of the same dataset may lead to different results. Secondly, the test typically loses power because it does not fully utilise the entire sample. As a remedy to these drawbacks, we study how to combine the test statistics or p-values resulting from multiple random realisations, for example from multiple random data splits. We introduce rank-transformed subsampling as a general method for delivering large-sample inference about the combined statistic or p-value under mild assumptions. We apply our methodology to a range of problems, including testing unimodality in high-dimensional data, testing goodness-of-fit of parametric quantile regression models, testing no direct effect in a sequentially randomised trial, and calibrating cross-fit double machine learning confidence intervals. For the latter, our method improves coverage in finite samples; for the testing problems, it is able to derandomise the test and improve power. Moreover, in contrast to existing p-value aggregation schemes that can be highly conservative, our method enjoys type-I error control that asymptotically approaches the nominal level.
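
To make the setting concrete, the following is a minimal sketch of multiple data splitting followed by a conventional, conservative aggregation rule. It is not the paper's rank-transformed subsampling procedure, which the abstract does not spell out. The toy test (choose a direction on one half, run a one-sided z-test for a zero mean on the other half) and all function names are assumptions made for illustration. The "twice the median" rule is one of the existing aggregation schemes that remain valid under arbitrary dependence between the p-values, at the cost of being conservative.

```python
import math
import random
import statistics


def split_p_value(data, rng):
    """One random split (hypothetical toy test of H0: mean = 0).

    Half A chooses the direction to test; half B runs a one-sided z-test.
    """
    shuffled = data[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    part_a, part_b = shuffled[:half], shuffled[half:]
    direction = 1.0 if statistics.fmean(part_a) >= 0 else -1.0
    z = (direction * statistics.fmean(part_b) * math.sqrt(len(part_b))
         / statistics.stdev(part_b))
    # One-sided normal p-value: 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))


def aggregate_twice_median(p_values):
    """Conservative baseline: twice the median p-value, capped at 1."""
    return min(1.0, 2.0 * statistics.median(p_values))


def multi_split_p_value(data, num_splits=50, seed=0):
    """Repeat the random split and aggregate the exchangeable p-values."""
    rng = random.Random(seed)
    p_values = [split_p_value(data, rng) for _ in range(num_splits)]
    return aggregate_twice_median(p_values), p_values


if __name__ == "__main__":
    gen = random.Random(1)
    sample = [gen.gauss(0.3, 1.0) for _ in range(200)]
    combined, per_split = multi_split_p_value(sample)
    print("single-split p-values range:", min(per_split), max(per_split))
    print("aggregated p-value:", combined)
```

Running the script shows how much the single-split p-values vary across splits, which is the non-reproducibility the abstract describes. The factor of two in the aggregation rule is the price paid for validity without any assumption on the dependence; the abstract's claim is that the proposed method avoids this conservativeness asymptotically.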

research
10/29/2021

Multiple-Splitting Projection Test for High-Dimensional Mean Vectors

We propose a multiple-splitting projection test (MPT) for one-sample mea...
research
09/04/2019

Group Inference in High Dimensions with Applications to Hierarchical Testing

Group inference has been a long-standing question in statistics and the ...
research
10/24/2022

E-Valuating Classifier Two-Sample Tests

We propose E-C2ST, a classifier two-sample test for high-dimensional dat...
research
02/26/2019

A Family of Exact Goodness-of-Fit Tests for High-Dimensional Discrete Distributions

The objective of goodness-of-fit testing is to assess whether a dataset ...
research
03/09/2023

Simulation-based, Finite-sample Inference for Privatized Data

Privacy protection methods, such as differentially private mechanisms, i...
research
11/01/2019

Exact model comparisons in the plausibility framework

Plausibility is a formalization of exact tests for parametric models and...
research
11/30/2021

Application of Equal Local Levels to Improve Q-Q Plot Testing Bands with R Package qqconf

Quantile-Quantile (Q-Q) plots are often difficult to interpret because i...
