cs.AI, cs.LG, cs.PL

Can LLMs Compress (and Decompress)? Evaluating Code Understanding and Execution via Invertibility

arXiv:2601.13398v2
Abstract: LLMs demonstrate strong performance on code benchmarks, yet consistent reasoning across forward and backward execution remains elusive. We present RoundTripCodeEval (RTCE), a benchmark of four …