CRISP: Persistent Concept Unlearning via Sparse Autoencoders
arXiv:2508.13650v3 Announce Type: replace
Abstract: As large language models (LLMs) are increasingly deployed in real-world applications, the need to selectively remove unwanted knowledge while preserving model utility has become paramount. Recent wor…