AI RESEARCH

Software Development Life Cycle Perspective: A Survey of Benchmarks for Code Large Language Models and Agents

arXiv cs.AI

arXiv:2505.05283v3 Announce Type: replace-cross

Code large language models (CodeLLMs) and agents are increasingly being integrated into complex software engineering tasks spanning the entire Software Development Life Cycle (SDLC). Benchmarking is critical for rigorously evaluating these capabilities. However, despite the growing significance of these benchmarks, comprehensive reviews that examine them from an SDLC perspective are still lacking.