FlashVGGT: Efficient and Scalable Visual Geometry Transformers with Compressed Descriptor Attention
arXiv:2512.01540v2 Announce Type: replace
Abstract: 3D reconstruction from multi-view images is a core challenge in computer vision. Recently, feed-forward methods have emerged as efficient and robust alternatives to traditional per-scene optimization…