Jainum Sanghavi - Provide.ai

From Edges to Depth: Probing the Spatial Hierarchy in Vision Transformers

Jainum Sanghavi / April 28, 2026

arXiv:2604.23452v1 Announce Type: new
Abstract: Vision Transformers trained only on image classification routinely transfer to tasks that demand spatial understanding, yet they receive no spatial supervision during pretraining. We ask where and how ro…

