A Powerful Bootstrap Test of Independence in High Dimensions

Abstract

This paper proposes a nonparametric test of independence of one random variable from a large pool of other random variables. The test statistic is the maximum of several Chatterjee rank correlations, and critical values are computed via a block multiplier bootstrap. The test is shown to asymptotically control size uniformly over a large class of data-generating processes, even when the number of variables is much larger than the sample size. The test is consistent against any fixed alternative. It can be combined with a stepwise procedure for selecting those variables from the pool that violate independence, while controlling the family-wise error rate. All formal results leave the dependence among variables in the pool completely unrestricted. In simulations, we find that our test is very powerful, outperforming existing tests in most scenarios considered, particularly in high dimensions and/or when the variables in the pool are dependent.
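To make the test statistic concrete, the following is a minimal Python sketch of Chatterjee's rank correlation and of its maximum over a pool of variables. It uses the standard no-ties formula, ξ_n = 1 − 3 Σ|r_{i+1} − r_i| / (n² − 1), where the pairs are sorted by the first argument and r_i is the rank of the second. The function names `chatterjee_xi` and `max_xi` are hypothetical, and so is the choice to pass each pool variable as the first argument. Any scaling or centering the paper applies is not reproduced, and the block multiplier bootstrap that supplies critical values is not attempted here.

```python
def ranks(values):
    """Return ranks 1..n of `values`, assuming there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n(x, y) for data without ties.

    Sort the pairs by x, take the ranks of y in that order, and sum the
    absolute differences of consecutive ranks.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    r = ranks(y)
    r_sorted = [r[i] for i in order]
    total = sum(abs(r_sorted[i + 1] - r_sorted[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


def max_xi(y, pool):
    """Maximum of xi_n(x_j, y) over the variables x_j in `pool`.

    `pool` is a list of columns, one per variable. The argument order
    (pool variable first, y second) is an assumption of this sketch.
    """
    return max(chatterjee_xi(xj, y) for xj in pool)


# Small illustration with one variable that determines y and one that is shuffled.
y = [2, 4, 6, 8, 10]
pool = [[1, 2, 3, 4, 5], [1, 5, 2, 4, 3]]
print(max_xi(y, pool))
```

For a perfectly monotone relationship the coefficient equals 1 − 3/(n + 1), which is 0.5 at n = 5 and approaches 1 as n grows, so small-sample values should not be read on the same scale as a Pearson correlation.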
