AI RESEARCH
Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models
arXiv cs.CV
arXiv:2603.22212v1 Announce Type: new Video-based world models have emerged along two dominant paradigms: video generation and 3D reconstruction. However, existing evaluation benchmarks either focus narrowly on visual fidelity and text-video alignment for generative models, or rely on static 3D reconstruction metrics that fundamentally neglect temporal dynamics. We argue that the future of world modeling lies in 4D generation, which jointly models spatial structure and temporal evolution.