CausalGaze: Unveiling Hallucinations via Counterfactual Graph Intervention in Large Language Models

arXiv:2604.11087v1 Announce Type: new Despite the groundbreaking advancements made by large language models (LLMs), hallucination remains a critical bottleneck for their deployment in high-stakes domains. Existing classification-based methods rely mainly on static, passive signals from internal states, which often capture noise and spurious correlations while overlooking the underlying causal mechanisms. To address this limitation, we shift the paradigm from passive observation to active intervention by.