The Finetuner's Fallacy: When to Pretrain with Your Finetuning Data

arXiv:2603.16177v1 Announce Type: new Real-world model deployments demand strong performance on narrow domains where data is often scarce. Typically, practitioners finetune models to specialize them, but this risks overfitting to the domain and forgetting general knowledge. We study a simple strategy, specialized pre