K Means is basically an RBF network
I have been working on a formulation of K Means as a continuous optimization problem instead of a discrete algorithm. The idea is to replace hard assignments with soft responsibilities and define a smooth objective that preserves the clustering structure while making the system fully differentiable and trainable end to end.
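To make the idea concrete, here is a minimal sketch of what such a soft formulation could look like. The post does not give the exact objective, so the specific choice below (a log-sum-exp smoothing with temperature `tau`, and softmax responsibilities) is my assumption about one standard way to set it up:

```python
import numpy as np

def soft_kmeans_objective(X, centers, tau):
    """Smoothed K-Means loss (assumed form, not necessarily the author's):
    -tau * sum_i log sum_k exp(-||x_i - mu_k||^2 / tau).
    As tau -> 0 this approaches sum_i min_k ||x_i - mu_k||^2 (hard K-Means)."""
    # Pairwise squared distances, shape (n, K)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    # Numerically stable log-sum-exp over clusters
    m = (-d2 / tau).max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(-d2 / tau - m).sum(axis=1))
    return -tau * lse.sum()

def responsibilities(X, centers, tau):
    """Soft assignments: softmax over clusters of -||x_i - mu_k||^2 / tau."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)
```

Everything here is differentiable in the centers, so the whole thing can sit inside a gradient-based pipeline.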
The main result is a Gamma-convergence analysis showing that this objective recovers standard K Means in the zero temperature limit. So the usual alternating updates are not fundamental; they emerge from a continuous variational problem when the smoothing vanishes.
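For readers who want the limit statement in symbols, here is the shape of the claim under the log-sum-exp smoothing assumed above (again, my reconstruction of a standard form, not necessarily the exact objective in the paper):

```latex
L_\tau(\mu) \;=\; -\tau \sum_{i=1}^{n} \log \sum_{k=1}^{K}
  \exp\!\left(-\frac{\lVert x_i - \mu_k \rVert^2}{\tau}\right)
\;\xrightarrow[\;\tau \to 0\;]{}\;
\sum_{i=1}^{n} \min_{1 \le k \le K} \lVert x_i - \mu_k \rVert^2 .
```

Pointwise this follows because the log-sum-exp is a smooth max; the Gamma-convergence claim is stronger, since it also controls the behavior of minimizers as the temperature vanishes.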
This also gives a precise connection with Radial Basis Function networks. Under this formulation, centers, assignments, and loss all live in one objective, and the difference between clustering and a neural model is just the amount of smoothing.
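The RBF connection is easy to see numerically: with Gaussian kernels, a normalized RBF layer computes exactly the same quantity as the softmax responsibilities. A small check (function names and the Gaussian/softmax pairing are my choices for illustration):

```python
import numpy as np

def normalized_rbf_layer(X, centers, tau):
    """Gaussian RBF activations phi_k(x) = exp(-||x - mu_k||^2 / tau),
    normalized across units, as in a normalized RBF network."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    phi = np.exp(-d2 / tau)
    return phi / phi.sum(axis=1, keepdims=True)

def softmax_responsibilities(X, centers, tau):
    """Soft K-Means assignments: softmax over clusters of -||x - mu||^2 / tau."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)
```

The two functions differ only in how they handle numerical stability; algebraically they are identical, which is the "K Means is basically an RBF network" point in one line.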
One thing I find interesting is that this removes the need to treat clustering as a separate block. In principle it can be embedded directly inside larger models and optimized jointly, although it is not obvious how stable or useful that is in practice.
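Embedding the clustering block in a larger model then just means differentiating the smooth loss with respect to the centers. Under the log-sum-exp objective assumed earlier, the gradient has a clean closed form (the derivative of the responsibilities cancels, envelope-style), which a sketch like this can use directly:

```python
import numpy as np

def soft_kmeans_grad(X, centers, tau):
    """Gradient of the assumed log-sum-exp loss w.r.t. the centers:
    grad_{mu_k} = 2 * sum_i r_ik (mu_k - x_i),
    where r are the softmax responsibilities."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    logits = -d2 / tau
    logits -= logits.max(axis=1, keepdims=True)
    r = np.exp(logits)
    r /= r.sum(axis=1, keepdims=True)                # (n, K)
    # 2 * (sum_i r_ik * mu_k - sum_i r_ik * x_i), vectorized over k
    return 2.0 * (r.sum(axis=0)[:, None] * centers - r.T @ X)

def gradient_step(X, centers, tau, lr=0.05):
    """One plain gradient-descent update on the centers."""
    return centers - lr * soft_kmeans_grad(X, centers, tau)
```

Setting the gradient to zero recovers the familiar weighted-mean update, which is one way to see why Lloyd-style alternation falls out of the continuous problem.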
I would be interested in critical feedback on both sides. On the theory side: whether the variational argument is actually tight or misses edge cases. On the practical side: whether this end-to-end view of clustering is something people would actually use, or whether standard K Means remains strictly better in real systems.