
A Scalable Nyström-Based Kernel Two-Sample Test with Permutations

arXiv CS.LG

arXiv:2502.13570v4 Announce Type: replace-cross

Two-sample hypothesis testing, i.e. determining whether two sets of data are drawn from the same distribution, is a fundamental problem in statistics and machine learning with broad scientific applications. In nonparametric testing, the maximum mean discrepancy (MMD) has gained popularity as a test statistic due to its flexibility and strong theoretical foundations. However, its use in large-scale scenarios is plagued by high computational costs.
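To make the cost concrete, here is a minimal sketch of the standard quadratic-time MMD permutation test, which is the baseline the abstract refers to and not the Nyström-based method the paper proposes. The Gaussian kernel, the fixed bandwidth, the biased (V-statistic) estimator, and all function names are assumptions chosen for illustration. The full Gram matrix over the pooled sample needs O(N^2) kernel evaluations for N pooled points, which is the bottleneck at scale.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def gram_matrix(points, bandwidth=1.0):
    """Full pairwise kernel matrix: O(N^2) kernel evaluations for N points."""
    n = len(points)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = gaussian_kernel(points[i], points[j], bandwidth)
            gram[i][j] = value
            gram[j][i] = value
    return gram


def mmd2_biased(gram, idx_x, idx_y):
    """Biased (V-statistic) estimate of squared MMD from a precomputed Gram matrix."""
    def mean_block(rows, cols):
        total = sum(gram[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))

    return (mean_block(idx_x, idx_x)
            + mean_block(idx_y, idx_y)
            - 2.0 * mean_block(idx_x, idx_y))


def mmd_permutation_test(xs, ys, num_permutations=200, bandwidth=1.0, seed=0):
    """Return (observed squared MMD, permutation p-value) for samples xs and ys."""
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    n = len(xs)
    # The Gram matrix is computed once; each permutation only re-indexes it.
    gram = gram_matrix(pooled, bandwidth)
    indices = list(range(len(pooled)))
    observed = mmd2_biased(gram, indices[:n], indices[n:])
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(indices)  # random relabeling of the pooled sample
        if mmd2_biased(gram, indices[:n], indices[n:]) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic as one of the permutations.
    p_value = (exceed + 1) / (num_permutations + 1)
    return observed, p_value
```

Calling `mmd_permutation_test(xs, ys)` with two lists of equal-dimension points returns the statistic and a p-value; the null hypothesis of equal distributions is rejected when the p-value falls below the chosen level. Reusing the Gram matrix across permutations avoids recomputing kernels, but memory and time still grow quadratically with the pooled sample size.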