cs.AI, cs.DB

ROSE: An Intent-Centered Evaluation Metric for NL2SQL

arXiv:2604.12988v1 Announce Type: cross
Abstract: Execution Accuracy (EX), the widely used metric for evaluating the effectiveness of Natural Language to SQL (NL2SQL) solutions, is becoming increasingly unreliable. It is sensitive to syntactic variati…
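
For readers unfamiliar with the metric, EX as it is commonly described can be sketched as follows: a predicted query counts as correct when executing it returns the same result as the gold query. This is a minimal illustration under stated assumptions, not the evaluation code of this paper or of any specific benchmark. The `singer` table, the sample queries, and the helper names `execution_match` and `execution_accuracy` are all hypothetical, and the order-insensitive row comparison is one choice among several that benchmarks make.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same rows, ignoring row order."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # Assumption: a predicted query that fails to execute counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Sort by repr so rows with NULLs or mixed types can still be ordered.
    return sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return hits / len(pairs)


# Hypothetical toy database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Ben', 45), (3, 'Cy', 27);
    """
)

pairs = [
    # Different SQL text, same rows on this database: counted as a match.
    ("SELECT name FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age >= 31"),
    # Extra column in the prediction: strict comparison counts it as a miss.
    ("SELECT name, age FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age > 30"),
    # Malformed prediction: execution fails, so it counts as a miss.
    ("SELEC name FROM singer",
     "SELECT name FROM singer"),
]

score = execution_accuracy(conn, pairs)
print(f"EX = {score:.3f}")
```

In this sketch the verdict depends entirely on comparing executed results on one database instance, so the outcome can change with the data loaded and with how strictly rows and columns are compared.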