cs.CL, cs.LG

Synthetic Sandbox for Training Machine Learning Engineering Agents

arXiv:2604.04872v1 Announce Type: new
Abstract: As large language model agents advance beyond software engineering (SWE) tasks toward machine learning engineering (MLE), verifying agent behavior becomes orders of magnitude more expensive: while SWE ta…