cs.CL

Polyglot Teachers: Evaluating Language Models for Multilingual Synthetic Data Generation

arXiv:2604.11290v1 Announce Type: new
Abstract: Synthesizing supervised finetuning (SFT) data from language models (LMs) to teach smaller models multilingual tasks has become increasingly common. However, teacher model selection is often ad hoc, typic…