Multi-modal user interface control detection using cross-attention
arXiv:2604.06934v1 Announce Type: cross
Abstract: Detecting user interface (UI) controls from software screenshots is a critical task for automated testing, accessibility, and software analytics, yet it remains challenging due to visual ambiguities, d…