AI RESEARCH
Distributional Off-Policy Evaluation with Deep Quantile Process Regression
arXiv cs.LG
arXiv:2604.18143v1 Announce Type: cross This paper investigates the off-policy evaluation (OPE) problem from a distributional perspective. Rather than focusing solely on the expectation of the total return, as in most existing OPE methods, we aim to estimate the entire return distribution. To this end, we