PRPO: Paragraph-level Policy Optimization for Vision-Language Deepfake Detection

arXiv:2509.26272v3 Announce Type: replace-cross The rapid rise of synthetic media has made deepfake detection a critical challenge for online safety and trust. Progress remains constrained by the scarcity of large, high-quality datasets. Although multimodal large language models (LLMs) exhibit strong reasoning capabilities, they perform poorly on deepfake detection, often producing explanations that are hallucinated or misaligned with the visual evidence. To address this limitation, we