Critical-CoT: A Robust Defense Framework against Reasoning-Level Backdoor Attacks in Large Language Models
arXiv:2604.10681v1
Abstract: Large Language Models (LLMs), despite their impressive capabilities across domains, have been shown to be vulnerable to backdoor attacks. Prior backdoor strategies predominantly operate at the token le…