cs.CV

Probing CLIP’s Comprehension of 360-Degree Textual and Visual Semantics

arXiv:2604.24642v1 Announce Type: new
Abstract: The dream of instantly creating rich 360-degree panoramic worlds from text is rapidly becoming a reality, yet a crucial gap remains in our ability to reliably evaluate how well the generated panoramas align semantically with the text. Contrastiv…