VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems

arXiv:2605.17467v1 Announce Type: new Large language model-driven multi-agent systems (LLM-MAS) excel at complex tasks, yet unreliable agents remain a key bottleneck to system-level reliability. Automatic failure attribution is, therefore, critical, but existing approaches, such as direct prediction of agent-error pairs and agent-first failure attribution, rely on agents' local logs and miss global failures that only manifest over full interaction trajectories, such as cross-step inconsistencies and inter-agent coordination errors.