AI RESEARCH
Attention-Guided Reward for Reinforcement Learning-based Jailbreak against Large Reasoning Models
arXiv CS.AI
arXiv:2605.19485v1 Announce Type: new Large Reasoning Models (LRMs) have demonstrated remarkable capabilities in solving complex problems by generating structured, step-by-step reasoning content. However, exposing a model's internal reasoning process