An Efficient and Effective Evaluator for Text2SQL Models on Unseen and Unlabeled Data

arXiv:2603.07841v1 Announce Type: new Recent advances in large language models have strengthened Text2SQL systems that translate natural language questions into database queries. A persistent deployment challenge is assessing a newly trained Text2SQL system on an unseen and unlabeled dataset when no verified answers are available. This situation arises frequently because database content and structure evolve, privacy policies slow manual review, and carefully written SQL labels are costly and time-consuming to produce.