DiscoUQ: Structured Disagreement Analysis for Uncertainty Quantification in LLM Agent Ensembles

arXiv:2603.20975v1 Announce Type: cross Multi-agent LLM systems, where multiple prompted instances of a language model independently answer questions, are increasingly used for complex reasoning tasks. However, existing methods for quantifying the uncertainty of their collective outputs rely on shallow voting statistics that discard the rich semantic information in agents' reasoning. We