Learn where to Click from Yourself: On-Policy Self-Distillation for GUI Grounding
arXiv:2605.00642v3 Announce Type: replace
Abstract: Graphical User Interface (GUI) grounding maps natural language instructions to the visual coordinates of target elements and serves as a core capability for autonomous GUI agents. Recent reinforcemen…
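
As a rough illustration of the grounding task defined in the abstract (an instruction goes in, a screen coordinate comes out), the sketch below scores predicted click points against target elements. The point-in-box criterion and the names `Box`, `click_hits_target`, and `grounding_accuracy` are assumptions made for this example; the abstract as shown here does not specify how predictions are evaluated.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a UI element, in pixels (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        # Edges count as inside the element.
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def click_hits_target(pred_xy: Tuple[float, float], target: Box) -> bool:
    """Return True if the predicted click point lands inside the target element."""
    x, y = pred_xy
    return target.contains(x, y)


def grounding_accuracy(
    preds: Sequence[Tuple[float, float]], targets: Sequence[Box]
) -> float:
    """Fraction of predicted clicks that land inside their target boxes.

    Assumed metric for illustration only, not taken from the source.
    """
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    hits = sum(click_hits_target(p, t) for p, t in zip(preds, targets))
    return hits / len(preds)
```

For example, with a hypothetical "Submit" button occupying `Box(40, 10, 120, 30)`, a predicted click at `(50, 20)` counts as a hit and one at `(10, 10)` counts as a miss.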