A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring
arXiv:2602.23163v3
Abstract: Large language models are beginning to show steganographic capabilities. Such capabilities could allow misaligned models to evade oversight mechanisms. Yet principled methods to detect and quan…