Not All Proofs Are Equal: Evaluating LLM Proof Quality Beyond Correctness
arXiv:2605.10379v1 Announce Type: new
Abstract: Large language models (LLMs) have become capable mathematical problem-solvers, often producing correct proofs for challenging problems. However, correctness alone is not sufficient: mathematical proofs s…