Permutation test

12/5/2023

Permutation testing is a very widely used tool for hypothesis testing. There already exist many resources that explain the procedure behind it, and I’ve also performed permutation testing myself in my own research. However, for a long time I felt like I never really understood permutation tests. WHY is it valid to just permute your sample/label pairs to construct an empirical null distribution? When is this procedure not valid? How does permutation testing compare with parametric hypothesis testing, and in what scenarios would we want to use either? This blog post represents my attempt to answer these questions in a way that is both non-technical and as detailed as possible.

Overview of permutation testing

There already exist a lot of resources on permutation testing, so the information in this section can be found basically anywhere. Most of the information in this blog post was drawn from a fantastic online resource by David Howell.

Permutation testing applies specifically to scenarios in which we have a predictor and a label, and we would like to test the hypothesis that there is no relationship between the two. There are scenarios in which we don’t have labels for our samples (for example, in unsupervised learning), and permutation testing does not apply to these settings. The actual procedure behind permutation testing is very straightforward. In our original data, each predictor is paired with a label. We compute some test statistic from these predictor/label pairs (this test statistic should tell us something about the relationship between predictor and label). Next, we permute the labels, so that each predictor is paired with a randomly reassigned label. By re-computing the test statistic for each permutation, we can build a “null distribution” of test statistics. Finally, we see what proportion of these null test statistics are at least as extreme as the test statistic we observed for the original data, and this proportion becomes our p-value.

What should the test statistic look like? It’ll depend on what our data looks like, and what hypothesis we aim to test. For example, if our goal is to test for significant differences between two groups of samples, then a sensible test statistic is the difference in sample means. Permutation testing can also be applied in a regression setting. For example, if we want to test whether the slope of a linear regression is nonzero, a sensible test statistic is the ordinary least squares estimate of the slope.
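
To make the procedure concrete, here is a minimal Python sketch of a permutation test with the two test statistics mentioned in this post (difference in sample means and OLS slope). It is only an illustration under my own assumptions: the function names, the toy data, the 0/1 group coding, the choice of 2000 random permutations (rather than enumerating every permutation), and the two-sided comparison using absolute values are all choices I made for the example, not something prescribed by the procedure itself.

```python
import random
from statistics import mean


def diff_in_means(predictors, labels):
    """Difference in mean predictor value between label 1 and label 0."""
    group1 = [x for x, g in zip(predictors, labels) if g == 1]
    group0 = [x for x, g in zip(predictors, labels) if g == 0]
    return mean(group1) - mean(group0)


def ols_slope(predictors, labels):
    """Ordinary least squares slope of labels regressed on predictors."""
    x_bar, y_bar = mean(predictors), mean(labels)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(predictors, labels))
    sxx = sum((x - x_bar) ** 2 for x in predictors)
    return sxy / sxx


def permutation_test(predictors, labels, statistic, n_permutations=2000, seed=0):
    """Return (observed statistic, two-sided permutation p-value)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistic(predictors, labels)
    shuffled = list(labels)    # copy, so the caller's labels stay untouched
    n_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the original predictor/label pairing
        # Count null statistics at least as extreme as the observed one.
        if abs(statistic(predictors, shuffled)) >= abs(observed):
            n_extreme += 1
    return observed, n_extreme / n_permutations


# Toy two-group data (made up for illustration): group 1 has larger values.
values = [5.1, 4.8, 5.6, 5.9, 6.2, 7.0, 6.8, 7.4, 6.9, 7.7]
groups = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
obs_diff, p_diff = permutation_test(values, groups, diff_in_means)
print("difference in means:", obs_diff, "p-value:", p_diff)

# Toy regression data (made up for illustration): y is exactly linear in x.
x = list(range(10))
y = [2 * xi + 1 for xi in x]
obs_slope, p_slope = permutation_test(x, y, ols_slope)
print("OLS slope:", obs_slope, "p-value:", p_slope)
```

Notice that the same `permutation_test` function handles both examples, since the only thing that changes between them is the test statistic. Because the sketch samples random permutations, the p-value is an approximation whose resolution is limited by `n_permutations`.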
