WebWorld is a series of large-scale open-web world models for training and evaluating web agents. It is trained on 1M+ real-world web interaction trajectories collected via a scalable hierarchical data pipeline, and supports:
- Long-horizon simulation (30+ steps)
- Multi-format state representations: A11y Tree, HTML, XML, Markdown, and natural language
- CoT-activated reasoning for transition prediction
- Cross-domain generalization to code, GUI, and game environments
Agents trained on WebWorld-synthesized trajectories achieve +9.9% on MiniWob++ and +10.9% on WebArena. When used for inference-time lookahead search, WebWorld outperforms GPT-5 as a world model.
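
To make "inference-time lookahead search" concrete, here is a minimal sketch of the general idea: before acting, the agent asks a world model to predict the state each candidate action would lead to, scores the predicted states, and picks the action with the best simulated outcome. This is not WebWorld's actual interface. `predict_next`, `propose_actions`, `score`, and the toy page graph are all hypothetical stand-ins; in practice `predict_next` would wrap a call to the world model and the state would be a page representation such as an A11y tree.

```python
from typing import Callable, Dict, Iterable, Tuple

State = str
Action = str


def rollout_value(
    state: State,
    action: Action,
    predict_next: Callable[[State, Action], State],
    propose_actions: Callable[[State], Iterable[Action]],
    score: Callable[[State], float],
    depth: int,
) -> float:
    """Best score reachable within `depth` simulated steps, starting with `action`."""
    next_state = predict_next(state, action)  # simulated, no real browser call
    best = score(next_state)
    if depth > 1:
        for follow_up in propose_actions(next_state):
            best = max(
                best,
                rollout_value(next_state, follow_up, predict_next,
                              propose_actions, score, depth - 1),
            )
    return best


def lookahead_select(
    state: State,
    predict_next: Callable[[State, Action], State],
    propose_actions: Callable[[State], Iterable[Action]],
    score: Callable[[State], float],
    depth: int = 2,
) -> Action:
    """Pick the candidate action whose simulated rollout scores highest."""
    candidates = list(propose_actions(state))
    if not candidates:
        raise ValueError("no candidate actions for this state")
    return max(
        candidates,
        key=lambda a: rollout_value(state, a, predict_next,
                                    propose_actions, score, depth),
    )


# Toy stand-in for a world model: a hand-written page transition table.
TOY_TRANSITIONS: Dict[Tuple[State, Action], State] = {
    ("home", "click_search"): "search",
    ("home", "click_cart"): "cart",
    ("search", "click_item"): "item",
    ("cart", "click_checkout"): "checkout_empty",
}


def toy_predict_next(state: State, action: Action) -> State:
    # Unknown (state, action) pairs leave the page unchanged.
    return TOY_TRANSITIONS.get((state, action), state)


def toy_propose_actions(state: State) -> list:
    return [a for (s, a) in TOY_TRANSITIONS if s == state]


def toy_score(state: State) -> float:
    # Hypothetical task: reach the item page.
    return 1.0 if state == "item" else 0.0


if __name__ == "__main__":
    print(lookahead_select("home", toy_predict_next,
                           toy_propose_actions, toy_score, depth=2))
```

In the toy graph, neither first step reaches the goal on its own, so a one-step search cannot tell the two actions apart; the two-step rollout can, which is why the quality of the simulated transitions matters for this kind of search.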
https://huggingface.co/Qwen/WebWorld-32B