Inference on Optimal Policy Values and Other Irregular Functionals via Softmax Smoothing
arXiv CS.LG
arXiv:2507.11780v2 Announce Type: replace-cross Constructing confidence intervals for the value of an unknown optimal treatment policy is a fundamental problem in causal inference. Insight into the optimal policy value can guide the development of reward-maximizing, individualized treatment regimes. However, because the functional that defines the optimal value is non-differentiable, standard semi-parametric approaches to inference are not directly applicable.
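To make the non-differentiability concrete, here is a minimal sketch of the general idea named in the title: replacing the hard maximum over treatments with a softmax-weighted average. It is an illustration under stated assumptions, not the paper's estimator. The array `q` (simulated conditional mean outcomes per unit and treatment arm), the inverse temperature `beta`, and both function names are hypothetical.

```python
import numpy as np


def hard_max_value(q):
    """Plug-in optimal value: mean over units of the best arm's outcome.

    The max is non-differentiable wherever two arms tie.
    """
    return float(np.mean(np.max(q, axis=1)))


def softmax_value(q, beta):
    """Smoothed surrogate: mean over units of softmax-weighted outcomes.

    beta is an assumed inverse-temperature parameter; larger beta puts
    more weight on the best arm and moves the surrogate toward the max.
    """
    # Subtract the row-wise max before exponentiating for numerical stability.
    z = beta * (q - q.max(axis=1, keepdims=True))
    w = np.exp(z)
    w /= w.sum(axis=1, keepdims=True)
    return float(np.mean(np.sum(w * q, axis=1)))


# Simulated conditional means for 1000 units and 2 arms (illustrative only).
rng = np.random.default_rng(0)
q = rng.normal(size=(1000, 2))

for beta in (1.0, 10.0, 100.0):
    print(beta, softmax_value(q, beta), hard_max_value(q))
```

In this sketch the smoothed value never exceeds the hard maximum and approaches it as `beta` grows, so a smooth (differentiable) target is traded against a smoothing bias that shrinks with `beta`.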