cs.CL

Not All Proofs Are Equal: Evaluating LLM Proof Quality Beyond Correctness

arXiv:2605.10379v1
Abstract: Large language models (LLMs) have become capable mathematical problem-solvers, often producing correct proofs for challenging problems. However, correctness alone is not sufficient: mathematical proofs s…