Physics-R1: An Audited Olympiad Corpus and Recipe for Visual Physics Reasoning
arXiv:2605.14040v1 Announce Type: new
Abstract: We audit the multimodal-physics evaluation pipeline end-to-end and document three undetected construction practices that distort how the field measures vision-language reasoning: train-eval contamination…