AI RESEARCH
End-to-End Efficient RL for Linear Bellman Complete MDPs with Deterministic Transitions
arXiv cs.LG
arXiv:2603.23461v1 Announce Type: new We study reinforcement learning (RL) with linear function approximation in Markov Decision Processes (MDPs) satisfying linear Bellman completeness, a fundamental setting in which the Bellman backup of any linear value function remains linear. While this setting is statistically tractable, prior computationally efficient algorithms are either limited to small action spaces or require strong oracle assumptions over the feature space.
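To make the completeness condition concrete, here is a minimal sketch on a hypothetical toy MDP. It is not the algorithm from the paper. The two-state, two-action MDP, its rewards, the one-hot feature map `phi`, and the undiscounted backup are all assumptions chosen for illustration. With one-hot (tabular) features, every Q-function is linear, so the backup of a linear Q-function with parameter `theta` is again linear with some parameter `theta_next`. Transitions are deterministic, so the backup needs no expectation over next states.

```python
from itertools import product

# Hypothetical toy MDP (an assumption for illustration, not from the paper):
# two states, two actions, deterministic transitions, undiscounted backup.
STATES = [0, 1]
ACTIONS = [0, 1]
NEXT_STATE = {(0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 1}
REWARD = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 0.5, (1, 1): 0.0}


def phi(s, a):
    """One-hot feature vector of dimension |S| * |A| (the tabular special case)."""
    vec = [0.0] * (len(STATES) * len(ACTIONS))
    vec[s * len(ACTIONS) + a] = 1.0
    return vec


def q_value(theta, s, a):
    """Linear value function: Q_theta(s, a) = <phi(s, a), theta>."""
    return sum(f * t for f, t in zip(phi(s, a), theta))


def bellman_backup(theta):
    """Return theta_next such that, for every (s, a),
    <phi(s, a), theta_next> = r(s, a) + max_b <phi(s', b), theta>,
    where s' = NEXT_STATE[(s, a)] is the deterministic successor."""
    theta_next = [0.0] * len(theta)
    for s, a in product(STATES, ACTIONS):
        s_next = NEXT_STATE[(s, a)]
        target = REWARD[(s, a)] + max(q_value(theta, s_next, b) for b in ACTIONS)
        # With one-hot features, the coordinate for (s, a) stores the target directly.
        theta_next[s * len(ACTIONS) + a] = target
    return theta_next


theta = [0.0, 2.0, 1.0, -1.0]
theta_next = bellman_backup(theta)
print(theta_next)
```

For general features of dimension smaller than the number of state-action pairs, such a `theta_next` need not exist. Linear Bellman completeness is exactly the assumption that it always does.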