Optimsyn: Influence-Guided Rubrics Optimization for Synthetic Data Generation
arXiv:2604.00536v1 Announce Type: new
Abstract: Large language models (LLMs) achieve strong downstream performance largely due to abundant supervised fine-tuning (SFT) data. However, high-quality SFT data in knowledge-intensive domains such as humanit…