Harnessing Reasoning Trajectories for Hallucination Detection via Answer-agreement Representation Shaping

arXiv:2601.17467v2 Announce Type: replace Large reasoning models (LRMs) often generate long, seemingly coherent reasoning traces yet still produce incorrect answers, making hallucination detection challenging. Although trajectories contain useful signals, directly using trace text or vanilla hidden states for detection is brittle: traces vary in form, and detectors can overfit to superficial patterns rather than answer validity. We