Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification
arXiv:2603.26648v2
Abstract: Recent advances in large language models have improved the capabilities of coding agents, yet systematic evaluation of complex, end-to-end website development remains limited. To address this g…