AI RESEARCH

Signature Approach for Contextual Bandits with Nonlinear and Path-dependent Rewards

arXiv cs.LG

arXiv:2605.10313v1 Announce Type: new

We study contextual bandits with nonlinear and path-dependent rewards through a novel approach based on the signature transform. Leveraging the universal nonlinearity property of signatures, we approximate continuous path-dependent reward functionals by linear functionals in the signature space. This representation enables the use of efficient linear contextual bandit methods while preserving expressive sequential structure. Building on this framework, we propose DisSigUCB, a signature-based disjoint upper confidence bound (UCB) algorithm.
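To make the idea concrete, here is a minimal Python sketch of the general recipe the abstract describes: map each context path to truncated signature features, then run a standard disjoint linear UCB on those features. It is not the authors' DisSigUCB. The names `signature_depth2` and `DisjointLinUCB`, the depth-2 truncation, the piecewise-linear interpolation of the path, and the parameters `alpha` and `ridge` are all assumptions made for illustration.

```python
import numpy as np


def signature_depth2(path):
    """Truncated (depth-2) signature of a piecewise-linear path.

    path: array of shape (T, d), one row per time step.
    Returns [1, level-1 terms, level-2 terms], length 1 + d + d*d.
    The truncation depth is an illustrative assumption.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    level1 = np.zeros(d)
    level2 = np.zeros((d, d))
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment:
        # the new level-2 term is (old level 1) x delta + delta x delta / 2.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return np.concatenate(([1.0], level1, level2.ravel()))


class DisjointLinUCB:
    """Standard disjoint linear UCB: one ridge regression per arm.

    Hypothetical helper for illustration, not the paper's algorithm.
    """

    def __init__(self, n_arms, dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha  # exploration weight (assumed value)
        self.A = [ridge * np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            theta = np.linalg.solve(A, b)  # ridge estimate for this arm
            bonus = self.alpha * np.sqrt(x @ np.linalg.solve(A, x))
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))  # ties go to the lowest arm index

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


# Usage: featurize a 2-D context path, pick an arm, feed back a reward.
path = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
x = signature_depth2(path)
bandit = DisjointLinUCB(n_arms=2, dim=x.size)
arm = bandit.select(x)
bandit.update(arm, x, reward=0.0)
```

The level-2 terms are what carry order information: reversing the order of the two segments in `path` leaves the level-1 increment unchanged but swaps the off-diagonal level-2 entries, so a model that is linear in these features can still distinguish paths that end at the same point.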