Martina G. Vilas, Timothy Schaumlöffel, Gemma Roig

Contextual inference from single objects in Vision-Language models

Martina G. Vilas, Timothy Schaumlöffel, Gemma Roig / March 31, 2026

arXiv:2603.26731v1 Announce Type: new
Abstract: How much scene context a single object carries is a well-studied question in human scene perception, yet how this capacity is organized in vision-language models (VLMs) remains poorly understood, with di…
