cs.CL

RewardBench 2: Advancing Reward Model Evaluation

arXiv:2506.01937v2 Announce Type: replace
Abstract: Reward models are used throughout the post-training of language models to capture nuanced signals from preference data and provide a training target for optimization across instruction following, rea…

cs.AI, cs.CL

GiVA: Gradient-Informed Bases for Vector-Based Adaptation

arXiv:2604.21901v1 Announce Type: new
Abstract: As model sizes continue to grow, parameter-efficient fine-tuning has emerged as a powerful alternative to full fine-tuning. While LoRA is widely adopted among these methods, recent research has explored …
