Leveraging RAG for Training-Free Alignment of LLMs
arXiv:2605.11217v1
Abstract: Large language model (LLM) alignment algorithms typically consist of post-training over preference pairs. While such algorithms are widely used to enable safety guardrails and align LLMs with general hum…