cs.AI, cs.LG

Automated Attention Pattern Discovery at Scale in Large Language Models

arXiv:2604.03764v1 Announce Type: cross
Abstract: Large language models have found success by scaling up capabilities to work in general settings. Unfortunately, the same cannot be said for interpretability methods. The current trend in mechanistic in…