
Pocket Foundation Models: Distilling TFMs into CPU-Ready Gradient-Boosted Trees

arXiv cs.LG

arXiv:2605.18654v1 Announce Type: new

A fraud scorer needs to answer in under 2 ms. The best tabular foundation models (TFMs) take 151-1,275 ms on GPU. We close this gap by distilling the TFM offline into an XGBoost or CatBoost student that runs natively on CPU. The central obstacle is specific to in-context learning (ICL) teachers: they leak labels when scoring their own