Evaluating LLM-Based Test Generation Under Software Evolution

arXiv:2603.23443v1 Announce Type: cross Large Language Models (LLMs) are increasingly used for automated unit test generation. However, it remains unclear whether these tests reflect genuine reasoning about program behavior or simply reproduce superficial patterns learned during