SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving

arXiv:2505.23932v3 Announce Type: replace

We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs) that closely mirrors real-world software development workflows. Unlike traditional static benchmarks, SwingArena models the collaborative process of software iteration by pairing LLMs as submitters, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines. To these interactive evaluations, we.
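
The submitter/reviewer pairing described in the abstract can be illustrated with a toy sketch. This is not SwingArena's implementation: every name here (`run_ci`, `play_round`, `RoundResult`, and the `model_a`/`model_b` stand-ins) is hypothetical, the "models" are hard-coded functions rather than LLM calls, and the "CI pipeline" is reduced to running one reviewer-written test against one submitted patch.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical simplification: a "patch" is a candidate implementation of a
# two-argument function, and a "test" is a check that raises on failure.
Patch = Callable[[int, int], int]
Test = Callable[[Patch], None]


@dataclass
class RoundResult:
    submitter: str
    reviewer: str
    ci_passed: bool


def run_ci(patch: Patch, test: Test) -> bool:
    """Stand-in for a CI pipeline: run the reviewer's test against the patch."""
    try:
        test(patch)
        return True
    except Exception:
        # Any failure (assertion or crash) counts as a failed CI run.
        return False


def play_round(
    submitter: str,
    submit: Callable[[], Patch],
    reviewer: str,
    review: Callable[[], Test],
) -> RoundResult:
    """One round: the submitter produces a patch, the reviewer produces a test."""
    patch = submit()
    test = review()
    return RoundResult(submitter, reviewer, run_ci(patch, test))


# Toy stand-ins for two models (assumptions for illustration only).
def model_a_submit() -> Patch:
    return lambda a, b: a + b  # correct patch for an "add" issue


def model_b_submit() -> Patch:
    return lambda a, b: a - b  # buggy patch for the same issue


def write_add_test() -> Test:
    def test(patch: Patch) -> None:
        assert patch(2, 3) == 5
        assert patch(-1, 1) == 0

    return test


# Each model takes a turn as submitter while the other acts as reviewer.
results = [
    play_round("model_a", model_a_submit, "model_b", write_add_test),
    play_round("model_b", model_b_submit, "model_a", write_add_test),
]

for r in results:
    print(f"submitter={r.submitter} reviewer={r.reviewer} ci_passed={r.ci_passed}")
```

Swapping the two roles across rounds, as in the last lines of the sketch, is one simple way to compare models both as patch authors and as test authors; how SwingArena itself schedules and scores rounds is not stated in the text above.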