AI RESEARCH

ReplaySCM: A Benchmark for Executable Causal Mechanism Induction from Interventions

arXiv cs.AI

arXiv:2605.08197v1 Announce Type: cross Most causal benchmarks for language models score local answers or graph structure. We