Can Vision-Language Models Think from the Sky? Unifying UAV Reasoning and Generation
arXiv:2604.05377v2 Announce Type: replace
Abstract: Vision-Language Models have made strong progress in ground-view visual understanding, yet they remain brittle in high-altitude Unmanned Aerial Vehicle (UAV) scenes, where objects are tiny and densely p…