Vibe Code Bench: Evaluating AI Models on End-to-End Web Application Development
arXiv:2603.04601v2
Abstract: Code generation has emerged as one of AI’s highest-impact use cases, yet existing benchmarks measure isolated tasks rather than the complete “zero-to-one” process of building a working applicat…