E2Edev: Benchmarking Large Language Models in End-to-End Software Development Task
arXiv:2510.14509v4 Announce Type: replace-cross
Abstract: Rapid advances in large language models (LLMs) have demonstrated significant potential for End-to-End Software Development (E2ESD). However, existing E2ESD benchmarks are limited by coarse…