cs.CL

WebMall — A Multi-Shop Benchmark for Evaluating Web Agents

arXiv:2508.13024v3 Announce Type: replace
Abstract: LLM-based web agents have the potential to automate long-running web tasks, such as searching for products in multiple e-shops and subsequently ordering the cheapest products that meet the user's need…