TAPO: Translation Augmented Policy Optimization for Multilingual Mathematical Reasoning
arXiv:2603.25419v1 Announce Type: new
Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency in English mathematical reasoning, yet a significant performance disparity persists in multilingual contexts, largely attributed to d…