AI RESEARCH

SceneCritic: A Symbolic Evaluator for 3D Indoor Scene Synthesis

arXiv CS.CL

arXiv:2604.13035v1 Announce Type: cross Large Language Models (LLMs) and Vision-Language Models (VLMs) increasingly generate indoor scenes through intermediate structures such as layouts and scene graphs, yet evaluation still relies on LLM or VLM judges that score rendered views, making judgments sensitive to viewpoint, prompt phrasing, and hallucination. When the evaluator is unstable, it becomes difficult to determine whether a model has produced a spatially plausible scene or whether the output score reflects the choice of viewpoint, rendering, or prompt. We.