Exploratory Optimal Stopping: A Singular Control Formulation

arXiv:2408.09335v3 Announce Type: replace-cross This paper explores continuous-time and state-space optimal stopping problems from a reinforcement learning perspective. We begin by formulating the stopping problem using randomized stopping times, where the decision maker's control is represented by the probability of stopping within a given time, specifically a bounded, non-decreasing, càdlàg control process. To encourage exploration and facilitate learning, we