AI RESEARCH

IPS: In-Prompt Process Supervision for Short Video Content Moderation

arXiv CS.CL

arXiv:2412.15251v3 Announce Type: replace

Multimodal large language models (MLLMs) are effective at capturing the semantics of short video content; however, they often fail to attend to the policy-specific details required for reliable content moderation. To address this limitation, we