WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback
arXiv:2408.15549v4 Announce Type: replace
Abstract: As large language models (LLMs) continue to advance, aligning these models with human preferences has emerged as a critical challenge. Traditional alignment methods, relying on human- or LLM-annotated…