Mobile-R1: Towards Interactive Capability for VLM-Based Mobile Agent via Systematic Training
arXiv:2506.20332v4 Announce Type: replace
Abstract: Vision-language model-based mobile agents have gained the ability to understand complex instructions and mobile screenshots, benefiting from reinforcement learning paradigms like Group Relative Polic…