Consistency Amplifies: How Behavioral Variance Shapes Agent Accuracy

arXiv:2603.25764v1 Announce Type: cross

As LLM-based agents are deployed in production systems, understanding their behavioral consistency (whether they produce similar action sequences when given identical tasks) becomes critical for reliability. We study consistency on SWE-bench, a challenging software engineering benchmark that requires complex, multi-step reasoning.
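To make the notion of behavioral consistency concrete, here is a minimal sketch of one way to score how similar an agent's action sequences are across repeated runs of the same task. It is an illustration only: the abstract does not state which metric the paper uses, so the `pairwise_consistency` function, the action names, and the choice of `difflib.SequenceMatcher` as the similarity measure are all assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def pairwise_consistency(runs):
    """Mean pairwise similarity of action sequences from repeated runs of one task.

    Assumption: similarity is difflib's ratio (1.0 for identical sequences,
    0.0 for sequences with no actions in common). This is a stand-in metric,
    not necessarily the one used in the paper.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to compare")
    scores = [
        SequenceMatcher(None, a, b, autojunk=False).ratio()
        for a, b in combinations(runs, 2)
    ]
    return sum(scores) / len(scores)


# Hypothetical action traces from three runs of the same task.
runs = [
    ["search", "open_file", "edit", "run_tests"],
    ["search", "open_file", "edit", "run_tests"],
    ["search", "run_tests", "edit", "run_tests"],
]
score = pairwise_consistency(runs)
print(f"consistency: {score:.3f}")
```

A score near 1.0 would indicate that the agent follows nearly the same action sequence on every run, while lower scores indicate greater behavioral variance. Other sequence measures, such as edit distance, could be substituted without changing the overall structure.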