Probing Structural Mathematical Reasoning in Language Models with Algebraic Trapdoors
arXiv:2605.04352v1 Announce Type: new
Abstract: We introduce a benchmark suite for evaluating structural mathematical reasoning in language models, built on subgroup-construction problems in SL(3, Z) with cryptographic-style verifier-prover asymmetry…
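
To make the verifier-prover asymmetry concrete, here is a minimal Python sketch. It is not the benchmark's actual construction, which the truncated abstract does not specify. The generator set `GENS`, the encoding of group elements as words over letters, and the brute-force `search` are all assumptions chosen for illustration. The point it shows is that checking a proposed word costs a handful of 3x3 integer matrix products, while finding one by exhaustive search grows exponentially with word length.

```python
from itertools import product

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

# Hypothetical generators (an assumption, not taken from the paper):
# two elementary matrices in SL(3, Z) and their inverses.
GENS = {
    "a": ((1, 1, 0), (0, 1, 0), (0, 0, 1)),
    "A": ((1, -1, 0), (0, 1, 0), (0, 0, 1)),
    "b": ((1, 0, 0), (0, 1, 1), (0, 0, 1)),
    "B": ((1, 0, 0), (0, 1, -1), (0, 0, 1)),
}


def matmul(x, y):
    """Multiply two 3x3 integer matrices given as tuples of row tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def det3(m):
    """Determinant of a 3x3 integer matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def evaluate(word):
    """Multiply out a word over the generator letters, left to right."""
    m = IDENTITY
    for letter in word:
        m = matmul(m, GENS[letter])
    return m


def verify(word, target):
    """Verifier side: cheap, one matrix product per letter."""
    return evaluate(word) == target


def search(target, max_len):
    """Prover side (naive): try every word up to max_len letters.

    The number of candidates is 4**n for length n, so this is only
    feasible for very short words.
    """
    for n in range(max_len + 1):
        for letters in product(GENS, repeat=n):
            word = "".join(letters)
            if verify(word, target):
                return word
    return None


# Target: the commutator of the two hypothetical generators.
target = evaluate("abAB")
found = search(target, max_len=4)
```

Here `verify("abAB", target)` needs four matrix products, whereas `search` may examine up to 1 + 4 + 16 + 64 + 256 candidate words before it succeeds. A real trapdoor construction would presumably hide the short word far more effectively than this toy does.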