Synthetic Sandbox for Training Machine Learning Engineering Agents
arXiv:2604.04872v1 Announce Type: new
Abstract: As large language model agents advance beyond software engineering (SWE) tasks toward machine learning engineering (MLE), verifying agent behavior becomes orders of magnitude more expensive: while SWE ta…