GIR-Bench: Versatile Benchmark for Generating Images with Reasoning

arXiv:2510.11026v2 Announce Type: replace

Unified multimodal models integrate the reasoning capacity of large language models with both image understanding and generation, showing great promise for advanced multimodal intelligence. However, the community still lacks a rigorous reasoning-centric benchmark to systematically evaluate the alignment between understanding and generation, and their generalization potential in complex visual tasks. To this end, we