OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Models

arXiv:2604.20806v1 Announce Type: cross Large vision-language models (LVLMs) have made substantial advances in reasoning tasks at the Olympiad level. Nevertheless, current Olympiad-level multimodal reasoning benchmarks for these models often emphasize single-image analysis and fail to exploit contextual information across multiple images. We present OMIBench, a benchmark designed to evaluate Olympiad-level reasoning when the required evidence is distributed over multiple images.