Volume Transformer: Revisiting Vanilla Transformers for 3D Scene Understanding
arXiv:2604.19609v1 Announce Type: new
Abstract: Transformers have become a common foundation across deep learning, yet 3D scene understanding still relies on specialized backbones with strong domain priors. This keeps the field isolated from the broad…