Abstain-R1: Calibrated Abstention and Post-Refusal Clarification via Verifiable RL
arXiv:2604.17073v1 Announce Type: new
Abstract: Reinforcement fine-tuning improves the reasoning ability of large language models, but it can also encourage them to answer unanswerable queries by guessing or hallucinating missing information. Existing…