EVM-QuestBench: An Execution-Grounded Benchmark for Natural-Language Transaction Code Generation
arXiv:2601.06565v5 Announce Type: replace
Abstract: Large language models are increasingly applied to various development scenarios. However, in on-chain transaction scenarios, even a minor error can cause irreversible loss for users. Existing evaluat…