LLM-Based Automated Diagnosis Of Integration Test Failures At Google

ArXi:2604.12108v1 Announce Type: cross Integration testing is critical for the quality and reliability of complex software systems. However, diagnosing their failures presents significant challenges due to the massive volume, unstructured nature, and heterogeneity of logs they generate. These result in a high cognitive load, low signal-to-noise ratio, and make diagnosis difficult and time-consuming. Developers complain about these difficulties consistently and report spending substantially time diagnosing integration test failures compared to unit test failures.