Geometric Factual Recall in Transformers
arXiv:2605.12426v1 Announce Type: new
Abstract: How do transformer language models memorize factual associations? A common view casts internal weight matrices as associative memories over pairs of embeddings, requiring parameter counts that scale line…
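To make the "associative memory" view mentioned in the abstract concrete, here is a minimal sketch of the classic outer-product construction, in which a single weight matrix stores key-value pairs of embeddings. It is an illustration under stated assumptions, not the paper's method: the names `build_memory` and `matvec`, the toy keys and values, and the choice of orthonormal keys are all hypothetical.

```python
# Minimal sketch of a linear associative memory (outer-product rule).
# Assumption: keys are orthonormal, so recall is exact. The data below
# is toy data chosen for illustration, not taken from the paper.

def build_memory(keys, values):
    """Return W = sum_i outer(v_i, k_i) as a nested list (d_out x d_in)."""
    d_out, d_in = len(values[0]), len(keys[0])
    W = [[0.0] * d_in for _ in range(d_out)]
    for k, v in zip(keys, values):
        for i in range(d_out):
            for j in range(d_in):
                W[i][j] += v[i] * k[j]
    return W

def matvec(W, x):
    """Multiply matrix W by vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

# Three orthonormal keys in 4 dimensions (scaled Hadamard rows).
keys = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
]
# Arbitrary 2-dimensional values to associate with each key.
values = [
    [1.0, 0.0],
    [0.0, 1.0],
    [2.0, -1.0],
]

W = build_memory(keys, values)

# Querying with a stored key returns its value, because
# W k_j = sum_i v_i (k_i . k_j) and the keys are orthonormal.
recalled = [matvec(W, k) for k in keys]
```

In this sketch, recall is exact only because the keys are mutually orthogonal with unit norm; with non-orthogonal keys the cross terms `k_i . k_j` would add interference to each retrieved value.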