Zhiqiu Xu, Shibo Jin, Shreya Arya, Mayur Naik

MathDuels: Evaluating LLMs as Problem Posers and Solvers

Zhiqiu Xu, Shibo Jin, Shreya Arya, Mayur Naik / April 24, 2026

arXiv:2604.21916v1
Abstract: As frontier language models attain near-ceiling performance on static mathematical benchmarks, existing evaluations are increasingly unable to differentiate model capabilities, largely because they cast …
