LatentDiff: Scaling Semantic Dataset Comparison to Millions of Images
arXiv:2605.00899v1 Announce Type: new
Abstract: We present LatentDiff, a scalable framework for semantic dataset comparison that operates directly in the latent space of pretrained vision encoders. By combining sparse autoencoder-based divergence test…