AtManRL: Towards Faithful Reasoning via Differentiable Attention Saliency
arXiv:2604.16158v1
Abstract: Large language models (LLMs) increasingly rely on chain-of-thought (CoT) reasoning to solve complex tasks. Yet ensuring that the reasoning trace both contributes to and faithfully reflects the processe…