STAGE: A Full-Screenplay Benchmark for Reasoning over Evolving Stories

arXiv:2601.08510v3 Announce Type: replace Movie screenplays are rich long-form narratives that interleave complex character relationships, temporally ordered events, and dialogue-driven interactions. While prior benchmarks target individual subtasks such as question answering or dialogue generation, they rarely evaluate whether models can construct a coherent story world and use it consistently across multiple forms of reasoning and generation. We