Spectral Geometry of LoRA Adapters Encodes Training Objective and Predicts Harmful Compliance
arXiv:2604.08844v1 Announce Type: new
Abstract: We study whether low-rank spectral summaries of LoRA weight deltas can identify which fine-tuning objective was applied to a language model, and whether that geometric signal predicts downstream behavior…
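As an illustration of what a low-rank spectral summary of a LoRA weight delta could look like, here is a minimal sketch. It assumes the usual LoRA parameterization, where the delta is the product B @ A with A of shape (r, d_in) and B of shape (d_out, r), and that r is no larger than either dimension. The function names and the four summary statistics are hypothetical choices made for this example; the abstract does not say which summaries the authors use.

```python
import numpy as np

def lora_singular_values(A, B):
    """Singular values of delta_W = B @ A without forming the full matrix.

    Assumes A has shape (r, d_in), B has shape (d_out, r), r <= min(d_in, d_out).
    """
    # Thin QR factorizations: B = Qb @ Rb and A.T = Qa @ Ra, with Rb, Ra of shape (r, r).
    _, Rb = np.linalg.qr(B)
    _, Ra = np.linalg.qr(A.T)
    # delta_W = Qb @ (Rb @ Ra.T) @ Qa.T, and Qb, Qa have orthonormal columns,
    # so delta_W shares its nonzero singular values with the small r x r core.
    return np.linalg.svd(Rb @ Ra.T, compute_uv=False)

def spectral_summary(A, B, eps=1e-12):
    """Hypothetical summary statistics of the adapter's spectrum."""
    s = lora_singular_values(A, B)   # sorted in descending order
    energy = s ** 2
    p = energy / (energy.sum() + eps)  # normalized spectral energy
    entropy = -np.sum(p * np.log(p + eps))
    return {
        "spectral_norm": float(s[0]),
        "frobenius_norm": float(np.sqrt(energy.sum())),
        "stable_rank": float(energy.sum() / (energy[0] + eps)),
        "effective_rank": float(np.exp(entropy)),
    }

# Random stand-in factors; a real adapter's A and B would be loaded from a checkpoint.
rng = np.random.default_rng(0)
r, d_in, d_out = 8, 64, 48
A = rng.normal(size=(r, d_in))
B = rng.normal(size=(d_out, r))
summary = spectral_summary(A, B)
```

Working on the r x r core keeps the cost independent of the full d_out x d_in matrix, which matters when the same summary is computed for every adapted layer.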