On the Use of Random Forest for Two-Sample Testing

03/14/2019
by Simon Hediger, et al.

We follow the line of work that uses classifiers for two-sample testing and propose several tests based on the Random Forest classifier. The developed tests are easy to use, require no tuning, and are applicable to any distribution on R^p, even in high dimensions. We provide a comprehensive treatment of the use of classification for two-sample testing, derive the distribution of our tests under the null hypothesis, and provide a power analysis, both in theory and with simulations. To simplify the use of the method, we also provide the R package "hypoRF".
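The general idea of a classifier two-sample test can be illustrated with a minimal Python sketch. This is not the authors' method and not the hypoRF interface: a nearest-centroid classifier stands in for the Random Forest so that the example needs only the standard library, and the function names (classifier_two_sample_test, binomial_lower_tail, gaussian_sample) are hypothetical. The sketch assumes equal sample sizes, so that under the null hypothesis the held-out error count is approximately Binomial(n_test, 0.5), and an error rate well below 0.5 is evidence that the two samples differ.

```python
import math
import random


def binomial_lower_tail(errors, n_test, p0=0.5):
    """Return P(X <= errors) for X ~ Binomial(n_test, p0)."""
    return sum(
        math.comb(n_test, k) * p0**k * (1 - p0) ** (n_test - k)
        for k in range(errors + 1)
    )


def classifier_two_sample_test(sample_a, sample_b, seed=0):
    """Hypothetical helper: return (held-out error rate, p-value).

    Assumption: a nearest-centroid classifier replaces the Random Forest.
    """
    rng = random.Random(seed)
    # Label each point by the sample it came from, then shuffle and split.
    labelled = [(x, 0) for x in sample_a] + [(x, 1) for x in sample_b]
    rng.shuffle(labelled)
    half = len(labelled) // 2
    train, test = labelled[:half], labelled[half:]

    # "Fit": one centroid per label, computed on the training half only.
    centroids = {}
    for label in (0, 1):
        points = [x for x, lab in train if lab == label]
        centroids[label] = [sum(col) / len(points) for col in zip(*points)]

    def predict(x):
        # Assign the label whose centroid is closest in squared distance.
        return min(
            (0, 1),
            key=lambda lab: sum((xi - ci) ** 2 for xi, ci in zip(x, centroids[lab])),
        )

    # Count misclassifications on the held-out half and test against chance.
    errors = sum(predict(x) != lab for x, lab in test)
    return errors / len(test), binomial_lower_tail(errors, len(test))


def gaussian_sample(n, dim, mean, rng):
    """Draw n points in R^dim with independent N(mean, 1) coordinates."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(1)
sample_a = gaussian_sample(200, 2, 0.0, rng)
sample_b = gaussian_sample(200, 2, 2.0, rng)  # mean shifted in every coordinate
error_rate, p_value = classifier_two_sample_test(sample_a, sample_b)
print(error_rate, p_value)
```

Splitting the data before fitting matters here: the binomial reference distribution only holds for errors measured on points the classifier has not seen.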

research
04/29/2020

Asymptotic Properties of High-Dimensional Random Forests

As a flexible nonparametric learning tool, random forest has been widely...
research
10/26/2020

Data Segmentation via t-SNE, DBSCAN, and Random Forest

This research proposes a data segmentation technique which is easy to in...
research
10/13/2020

Automation of Hemocompatibility Analysis Using Image Segmentation and a Random Forest

The hemocompatibility of blood-contacting medical devices remains one of...
research
06/02/2022

Sequential Permutation Testing of Random Forest Variable Importance Measures

Hypothesis testing of random forest (RF) variable importance measures (V...
research
06/19/2018

vsgoftest: An R Package for Goodness-of-Fit Testing Based on Kullback-Leibler Divergence

The R-package vsgoftest performs goodness-of-fit (GOF) tests, based on S...
research
10/20/2016

Revisiting Classifier Two-Sample Tests

The goal of two-sample tests is to assess whether two samples, S_P ∼ P^n...
