cs.CV, cs.LG

Vision2Code: A Multi-Domain Benchmark for Evaluating Image-to-Code Generation

arXiv:2605.11307v1 Announce Type: new
Abstract: Image-to-code generation tests whether a vision-language model (VLM) can recover the structure of an image well enough to express it as executable code. Existing benchmarks either focus on narrow visual domai…