Ulterior Motives: Detecting Misaligned Reasoning in Continuous Thought Models
arXiv:2604.23460v1 Announce Type: new
Abstract: Chain-of-Thought (CoT) reasoning has emerged as a key technique for eliciting complex reasoning in Large Language Models (LLMs). Although interpretable, its dependence on natural language limits the mode…
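The abstract refers to Chain-of-Thought (CoT) reasoning, where a model is prompted to write out intermediate reasoning steps in natural language before giving an answer. A minimal illustrative sketch of zero-shot CoT prompting is below; the template and example question are generic illustrations, not taken from this paper.

```python
# Minimal sketch of zero-shot Chain-of-Thought (CoT) prompting: the model is
# cued to produce intermediate natural-language reasoning before its answer.
# The template wording and the sample question are illustrative assumptions.

def build_cot_prompt(question: str) -> str:
    """Wrap a question in a simple zero-shot CoT template."""
    return f"Q: {question}\nA: Let's think step by step."

if __name__ == "__main__":
    prompt = build_cot_prompt(
        "If a train travels 60 km in 1.5 hours, what is its average speed?"
    )
    print(prompt)
```

In practice the returned string would be sent to an LLM, whose step-by-step output constitutes the interpretable (but natural-language-bound) reasoning trace the abstract contrasts with continuous thought models.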