cs.AI, cs.LG, cs.SE

MobiFlow: Real-World Mobile Agent Benchmarking through Trajectory Fusion

arXiv:2604.09587v1 Announce Type: cross
Abstract: Mobile agents can autonomously complete user-assigned tasks through GUI interactions. However, existing mainstream evaluation benchmarks, such as AndroidWorld, operate by connecting to a system-level A…