Vision2Code: A Multi-Domain Benchmark for Evaluating Image-to-Code Generation

arXiv:2605.11307v1 Announce Type: cross Image-to-code generation tests whether a vision-language model (VLM) can recover the structure of an image well enough to express it as executable code. Existing benchmarks either focus on narrow visual domains, depend on paired executable reference code, or rely on generic rubrics that miss domain-specific reconstruction errors. We