Simple Projection-Free Algorithm for Contextual Recommendation with Logarithmic Regret and Robustness

arXiv:2603.20826v1 Announce Type: new Contextual recommendation is a variant of contextual linear bandits in which the learner observes an (optimal) action rather than a scalar reward. Recently, Sakaue developed an efficient Online Newton Step (ONS) approach with an $O(d\log T)$ regret bound, where $d$ is the dimension of the action space and $T$ is the time horizon. In this paper, we present a simple algorithm that is more efficient than the ONS-based method while achieving the same regret guarantee.