Sharper Guarantees for Misspecified Kernelized Bandit Optimization
arXiv CS.LG
arXiv:2605.05967v1 Announce Type: new Existing guarantees for misspecified kernelized bandit optimization pay for misspecification through kernel complexity. In generic offline bounds, the misspecification level $\varepsilon$ is multiplied by $\sqrt{d_\mathrm{eff}}$, where $d_\mathrm{eff}$ is the kernel effective dimension. In online regret bounds, the corresponding penalty is $\sqrt{\gamma_n}\,n\varepsilon$, where $\gamma_n$ is the maximum information gain after $n$ rounds of interaction.
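To give a feel for the size of these two penalty terms, here is a minimal numerical sketch. It rests on assumptions that the abstract does not state: the textbook definitions $d_\mathrm{eff} = \mathrm{tr}\big(K(K+\lambda I)^{-1}\big)$ and information gain $\tfrac12 \log\det(I + K/\lambda)$, a squared-exponential kernel, a random design on $[0,1]$, and illustrative values for $\varepsilon$, $\lambda$, and the lengthscale. The paper's exact definitions may differ. The information gain of a single design is only a lower bound on $\gamma_n$, which is a maximum over all designs of size $n$.

```python
import numpy as np

def rbf_kernel(X, lengthscale=0.2):
    # Squared-exponential kernel matrix for points X of shape (n, d).
    # The kernel choice and lengthscale are assumptions for illustration.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / lengthscale**2)

def effective_dimension(K, lam):
    # Assumed definition: d_eff = tr(K (K + lam I)^{-1}),
    # evaluated through the eigenvalues of K.
    eig = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    return float(np.sum(eig / (eig + lam)))

def information_gain(K, lam):
    # 0.5 * log det(I + K / lam) for this particular design.
    # gamma_n is the maximum of this quantity over designs of size n,
    # so the value returned here is only a lower bound on gamma_n.
    n = K.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) + K / lam)
    return 0.5 * float(logdet)

rng = np.random.default_rng(0)
n, eps, lam = 200, 0.05, 1.0          # illustrative values, not from the paper
X = rng.uniform(0.0, 1.0, size=(n, 1))
K = rbf_kernel(X)

d_eff = effective_dimension(K, lam)
gain = information_gain(K, lam)

offline_penalty = eps * np.sqrt(d_eff)     # eps * sqrt(d_eff)
online_penalty = np.sqrt(gain) * n * eps   # sqrt(gamma_n) * n * eps, with gain standing in for gamma_n

print(f"d_eff = {d_eff:.2f}, information gain = {gain:.2f}")
print(f"offline penalty = {offline_penalty:.3f}, online penalty = {online_penalty:.3f}")
```

Under these assumed definitions, both complexity terms grow as the kernel becomes richer (for example, with a shorter lengthscale), so the misspecification penalty in each bound grows with them.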