Abstain-R1: Calibrated Abstention and Post-Refusal Clarification via Verifiable RL

arXiv:2604.17073v1 Announce Type: new Reinforcement fine-tuning improves the reasoning ability of large language models, but it can also encourage them to answer unanswerable queries by guessing or hallucinating missing information. Existing abstention methods either train models to produce generic refusals or encourage follow-up clarifications without verifying whether those clarifications identify the key missing information.