Test-Time Matching: Unlocking Compositional Reasoning in Multimodal Models
arXiv:2510.07632v2 Announce Type: replace
Abstract: Frontier AI models have achieved remarkable progress, yet recent studies suggest they struggle with compositional reasoning, often performing at or below random chance on established benchmarks. We r…