cs.CR, cs.LG

Mechanistic Anomaly Detection via Functional Attribution

arXiv:2604.18970v1 Announce Type: new
Abstract: We can often verify the correctness of neural network outputs using ground truth labels, but we cannot reliably determine whether the output was produced by normal or anomalous internal mechanisms. Mecha…