A 760-million-active-parameter MoE that never touched a single NVIDIA H100 in training scored 89.6% on HMMT ’25 math — 1.3 points higher…