ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners?

arXiv:2603.25823v1 Announce Type: cross Beneath the stunning visual fidelity of modern AIGC models lies a "logical desert", where systems fail at tasks that require physical, causal, or complex spatial reasoning. Current evaluations rely largely on superficial metrics or fragmented benchmarks, creating a "performance mirage" that overlooks the generative process. To address this, we