
The Sampling Complexity of Condorcet Winner Identification in Dueling Bandits

arXiv cs.LG

arXiv:2603.15189v1 Announce Type: cross We study best-arm identification in stochastic dueling bandits under the sole assumption that a Condorcet winner exists, i.e., an arm that wins each noisy pairwise comparison with probability at least $1/2$. We