Clustering with Uniformity- and Neighbor-Based Random Geometric Graphs
arXiv:2501.06268v3 Announce Type: replace-cross
Abstract: We propose a graph-based clustering method based on Cluster Catch Digraphs (CCDs) that extends their applicability to
moderate-dimensional data settings. Existing CCD variants, such as RK-CCDs, rely on spatial randomness tests based on
Ripley's K function, which exhibit performance degradation as dimensionality increases. To address this limitation, we
introduce a nearest-neighbor-distance (NND) based Monte Carlo spatial randomness test (MC-SRT) for determining
covering radii, resulting in the proposed Uniformity- and Neighbor-based CCDs (UN-CCDs). The proposed method is
designed for datasets of moderate size and dimension, particularly in settings with complex cluster geometry and
uniformly distributed background noise. Through Monte Carlo simulations and experiments on benchmark datasets, we show
that UN-CCDs provide stable and competitive performance relative to several established clustering methods within the
evaluated regimes, while remaining largely parameter-free. We also discuss computational trade-offs and identify the
practical regimes in which the method is most effective. -- Keywords:
Graph-based clustering; Cluster catch digraphs; Moderate-dimensional data; the nearest neighbor distance; Spatial
randomness test.