New Bounds for Kernel Sums via Fast Spherical Embeddings
arXiv:2605.01263v1 Announce Type: cross
Abstract: We study query time bounds for the fundamental problem of estimating the kernel mean $\frac1{|X|}\sum_{x\in X}\mathbf{k}(x,y)$ of a query $y$ in a finite dataset $X\subset\mathbb{R}^d$ up to a prescribed additive error $\varepsilon$. The best known bounds for the Gaussian kernel are $O(d/\varepsilon^2)$, $\widetilde O(d+1/\varepsilon^4)$, and $\widetilde O(d+\Delta^2/\varepsilon^2)$, where $\Delta$ is the diameter of a region containing the points. We prove the new bound $\tilde O(d+\varepsilon\Delta^2+1/\varepsilon^3)$, which improves over the previous ones in regimes with small error $\varepsilon$ and intermediate diameter $\Delta$.
At the center of our proof is a new fast spherical embedding theorem in the sense introduced by Bartal, Recht and Schulman (2011), which limits the embedded data diameter while preserving local Euclidean distances and avoiding ``distance collapse'' at larger scales. This fast embedding theorem may be of independent interest.