Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time
arXiv:2605.15220v1 Announce Type: cross
Abstract: Data mixing decides how to combine different sources or types of data and is a consequential problem throughout language model training. In pretraining, data composition is a key determinant of model q…