AI RESEARCH
Spark Policy Toolkit: Semantic Contracts and Scalable Execution for Policy Learning in Spark
arXiv CS.LG
arXiv:2604.25061v1 Announce Type: cross Custom policy-learning pipelines in Spark fail for two coupled systems reasons: rowwise Python execution makes inference impractical, and driver-side candidate materialization makes split search fragile at feature scale. We present Spark Policy Toolkit, a semantics-governed systems toolkit for scalable policy learning in Spark. The toolkit provides two Spark-native primitives: partition-initialized vectorized inference through mapInPandas and mapInArrow, and collect-less split search that scores candidates on executors.
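As a rough illustration of the first primitive, the sketch below shows one way partition-initialized vectorized inference could look with mapInPandas. It is not the toolkit's own code: `ThresholdPolicy`, `infer_partition`, `run_on_spark`, the column names `id`, `x1`, `x2`, and the weights are all hypothetical, chosen only to show the pattern of building the model once per partition and scoring whole batches instead of single rows.

```python
from typing import Iterator

import numpy as np
import pandas as pd


class ThresholdPolicy:
    """Hypothetical stand-in for a policy model that is costly to load."""

    loads = 0  # counts constructions, to make the once-per-partition init visible

    def __init__(self) -> None:
        ThresholdPolicy.loads += 1
        self.weights = np.array([0.5, -0.25])  # illustrative values only

    def predict(self, features: np.ndarray) -> np.ndarray:
        # Vectorized scoring of a whole batch: action 1 if the score is positive, else 0.
        return (features @ self.weights > 0).astype("int64")


def infer_partition(batches: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
    """Function in the shape mapInPandas expects: iterator of batches in, iterator out."""
    model = ThresholdPolicy()  # initialized once per partition, not once per row
    for batch in batches:
        out = batch[["id"]].copy()
        out["action"] = model.predict(batch[["x1", "x2"]].to_numpy())
        yield out


def run_on_spark(df):
    """Apply the function to a Spark DataFrame with columns id (long), x1, x2 (double)."""
    return df.mapInPandas(infer_partition, schema="id long, action long")
```

Because `infer_partition` only consumes an iterator of pandas DataFrames, it can be exercised locally without a Spark session; `run_on_spark` is the only part that needs a cluster.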