Code Comprehension then Auditing for Unsupervised LLM Evaluation
arXiv:2410.03131v4 Announce Type: replace-cross
Abstract: Large Language Models (LLMs) for unsupervised code correctness evaluation have recently gained attention because they can judge whether code runs as intended without requiring reference implementati…