Can LLMs Compress (and Decompress)? Evaluating Code Understanding and Execution via Invertibility
arXiv:2601.13398v2
Abstract: LLMs demonstrate strong performance on code benchmarks, yet consistent reasoning across forward and backward execution remains elusive. We present RoundTripCodeEval (RTCE), a benchmark of four …
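
To make the idea of forward and backward consistency concrete, here is a minimal sketch of a round-trip check. It is only an illustration: the abstract is truncated here, so RTCE's actual tasks and scoring are not shown, and the run-length codec and the names `rle_encode`, `rle_decode`, and `round_trip_ok` are hypothetical stand-ins chosen for this example, not the benchmark's own.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Forward direction: collapse each run of identical characters into (char, count)."""
    # Hypothetical codec for illustration; not taken from the RTCE benchmark.
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Backward direction: expand each (char, count) pair back into a run."""
    return "".join(ch * count for ch, count in pairs)


def round_trip_ok(text: str) -> bool:
    """Return True when decoding the encoding reproduces the input exactly."""
    return rle_decode(rle_encode(text)) == text


if __name__ == "__main__":
    sample = "aaabcc"
    print(rle_encode(sample))
    print(round_trip_ok(sample))
```

A model that reasons consistently in both directions would, under this reading, predict the output of `rle_encode` for a given input and also recover that input from the encoded form; a mismatch in either direction fails the round trip.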