Order-Explicit Linearization of High-Dimensional $U$-Statistics

arXiv:2405.07860v4

Abstract: We give an order-explicit large deviation bound for the difference between a high-dimensional $U$-statistic and its Hájek projection. In particular, we show that any $U$-statistic of order $b$ on $n$ observations, with a $d$-dimensional kernel whose coordinates have $\psi_1$-Orlicz norm at most $\phi$, deviates from its Hájek projection by at most $O_p(\phi b n^{-1}\log^2(dn))$, uniformly over the $d$ coordinates. The proof relies on novel order-explicit moment inequalities for higher-order Hoeffding components. We show that this rate is unimprovable, up to the power of the logarithmic factor. As corollaries, we obtain new Bernstein-type concentration and Gaussian approximation results for high-dimensional $U$-statistics. We apply these results to establish the consistency of a set of resampling-based simultaneous confidence intervals built around a class of nonparametric regression estimators constructed with subsampled kernels. This class encompasses several forms of random forest regression, including Generalized Random Forests.
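To make the objects in the abstract concrete, the following is a minimal illustrative sketch, not the paper's construction. It assumes an order $b = 2$ kernel $h(x, y) = (x - y)^2 / 2$ applied coordinate-wise to i.i.d. standard normal data, so that the $U$-statistic is the unbiased sample variance, the target is $\theta = 1$, and the first-order Hoeffding term is $h_1(x) = (x^2 - 1)/2$. The function names and the choices of `n` and `d` are hypothetical. The Hájek projection is then $\theta + (b/n)\sum_i h_1(X_i)$, and the quantity bounded in the abstract is the maximum coordinate-wise gap between the two.

```python
import numpy as np
from itertools import combinations


def u_statistic(x, kernel, b):
    """Complete U-statistic of order b: average the kernel over all size-b subsets of rows."""
    n = x.shape[0]
    vals = [kernel(x[list(idx)]) for idx in combinations(range(n), b)]
    return np.mean(vals, axis=0)


def variance_kernel(pair):
    # Assumed example kernel: h(x, y) = (x - y)^2 / 2, applied coordinate-wise.
    return 0.5 * (pair[0] - pair[1]) ** 2


def hajek_projection_std_normal(x):
    # Valid only under the assumption of standard normal coordinates:
    # theta = 1 and h_1(x) = E[h(x, Y)] - theta = (x^2 - 1) / 2.
    theta, b = 1.0, 2
    h1 = 0.5 * (x ** 2 - 1.0)
    return theta + b * h1.mean(axis=0)  # theta + (b / n) * sum_i h_1(X_i)


rng = np.random.default_rng(0)
n, d = 60, 5  # hypothetical sample size and dimension
x = rng.standard_normal((n, d))

u = u_statistic(x, variance_kernel, b=2)   # shape (d,)
u_hat = hajek_projection_std_normal(x)     # shape (d,)
max_gap = np.max(np.abs(u - u_hat))        # sup-norm linearization error
print(max_gap)
```

In this special case the gap has the closed form $\big(\tfrac{1}{n}\sum_i X_i^2 - n\bar{X}^2\big)/(n-1)$ per coordinate, which is of order $n^{-1}$ and so consistent with the $n^{-1}$ scaling in the stated rate; the sketch does not verify the dependence on $b$, $\phi$, or $\log(dn)$.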
