A Single-Sample Polylogarithmic Regret Bound for Nonstationary Online Linear Programming

arXiv:2603.14673v1 Announce Type: cross We study nonstationary Online Linear Programming (OLP), where $n$ orders arrive sequentially with reward-resource consumption pairs that form a sequence of independent, but not necessarily identically distributed, random vectors. At the beginning of the planning horizon, the decision-maker is provided with a resource endowment that is sufficient to fulfill a significant portion of the requests.
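
For orientation, the setting described above can be written as the standard offline linear program that OLP algorithms are usually benchmarked against. This is a sketch under assumed notation, not the paper's own formulation: the symbols $r_t$ (reward of order $t$), $a_t \in \mathbb{R}^m$ (resource consumption of order $t$), $b \in \mathbb{R}^m$ (resource endowment), and $x_t$ (acceptance decision) do not appear in the abstract and are introduced here for illustration only.

```latex
% Assumed notation (not from the abstract):
%   r_t  reward of order t,  a_t  resource consumption vector of order t,
%   b    initial resource endowment,  x_t  fraction of order t accepted.
\begin{align*}
\max_{x_1,\dots,x_n} \quad & \sum_{t=1}^{n} r_t x_t \\
\text{s.t.} \quad & \sum_{t=1}^{n} a_t x_t \le b, \\
& 0 \le x_t \le 1, \qquad t = 1,\dots,n.
\end{align*}
```

In the online version, each $x_t$ would have to be fixed when the pair $(r_t, a_t)$ is revealed, without knowledge of later orders. Regret is then commonly measured as the expected gap between the optimal value of this program and the total reward collected online; whether the paper uses exactly this benchmark is an assumption.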