Don’t Retrieve, Generate: Prompting LLMs for Synthetic Training Data in Dense Retrieval
arXiv:2504.21015v4
Abstract: Training effective dense retrieval models typically relies on hard negative (HN) examples mined from large document corpora using methods such as BM25 or cross-encoders, which require full corp…
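
For context, the conventional mining step that the abstract refers to can be sketched roughly as follows. This is a minimal illustration of BM25-based hard negative mining over a toy in-memory corpus, not the method proposed in the paper. The function names (`bm25_scores`, `mine_hard_negatives`), the parameter defaults, and the four example documents are all assumptions made up for this sketch.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real pipelines use a proper analyzer.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def mine_hard_negatives(query, positive_idx, docs, top_k=2):
    """Return indices of the top-scoring documents that are not the positive."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    # Keep only documents with some lexical overlap; zero-score ones are easy negatives.
    return [i for i in ranked if i != positive_idx and scores[i] > 0][:top_k]


# Toy corpus (hypothetical): index 0 is the labeled positive for the query.
docs = [
    "dense retrieval encodes queries and passages into vectors",
    "sparse retrieval with bm25 ranks passages by term overlap",
    "the recipe calls for two cups of flour",
    "dense fog delayed flights at the airport",
]
query = "how does dense retrieval work"
negatives = mine_hard_negatives(query, positive_idx=0, docs=docs)
print(negatives)
```

Note that the scoring loop touches every document in `docs`, which is the corpus-access requirement the abstract points to. The unrelated recipe document shares no terms with the query, so it is not selected as a hard negative.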