Privacy Collapse: Benign Fine-Tuning Can Break Contextual Privacy in Language Models
arXiv:2601.15220v2
Abstract: We identify a novel phenomenon in language models: benign fine-tuning of frontier models can lead to privacy collapse. We find that diverse, subtle patterns in training data can degrade contextual pr…