HaS: Accelerating RAG through Homology-Aware Speculative Retrieval
arXiv:2604.20452v1 Announce Type: cross
Abstract: Retrieval-Augmented Generation (RAG) expands the knowledge boundary of large language models (LLMs) at inference time by retrieving external documents as context. However, retrieval becomes increasingly tim…