Queue Length Regret Bounds for Contextual Queueing Bandits

arXiv cs.LG

We introduce contextual queueing bandits, a new context-aware framework for scheduling while simultaneously learning unknown service rates. Individual jobs carry heterogeneous contextual features, based on which the agent chooses a job and matches it with a server to maximize the departure rate. To evaluate the performance of a policy, we consider queue length regret, d