cs.AI, cs.CL, cs.SE

E2Edev: Benchmarking Large Language Models in End-to-End Software Development Task

arXiv:2510.14509v4 Announce Type: replace-cross
Abstract: Rapid advances in large language models (LLMs) have shown significant potential for End-to-End Software Development (E2ESD). However, existing E2ESD benchmarks are limited by coarse…