Contraction-Aligned Analysis of Soft Bellman Residual Minimization with Weighted Lp-Norm for Markov Decision Problem

ArXi:2604.06837v1 Announce Type: new The problem of solving Marko decision processes under function approximation remains a fundamental challenge, even under linear function approximation settings. A key difficulty arises from a geometric mismatch: while the Bellman optimality operator is contractive in the Linfty-norm, commonly used objectives such as projected value iteration and Bellman residual minimization rely on L2-based formulations.