THEMIS: Towards Holistic Evaluation of MLLMs for Scientific Paper Fraud Forensics

arXiv:2603.25089v1 Announce Type: new We present THEMIS, a novel multi-task benchmark designed to comprehensively evaluate multimodal large language models (MLLMs) on visual fraud reasoning within real-world academic scenarios. Compared to existing benchmarks