When Prompts Interact: Assessing Prompt Arithmetic for Deconfounding under Distribution Shift

arXiv:2605.03096v1 Announce Type: new In classification tasks, models may rely on confounding variables to achieve strong in-distribution performance, learning spurious features that fail under distribution shift. This shortcut behavior substantially degrades performance in out-of-distribution settings. Task arithmetic offers a potential solution: it removes unwanted signals by subtracting the parameter updates of a secondary model. However, it typically requires full fine-tuning, which is computationally expensive.
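To make the subtraction step concrete, the following is a minimal sketch of task arithmetic on toy parameters, not the paper's implementation. It assumes the common formulation in which a task vector is the difference between fine-tuned and pretrained weights. The names `task_vector`, `subtract_confounder`, and `scale`, as well as the toy weights, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Toy stand-in for a model's parameters: name -> flat list of weights.
Params = Dict[str, List[float]]


def task_vector(pretrained: Params, finetuned: Params) -> Params:
    """Return the update tau = theta_finetuned - theta_pretrained, per parameter."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def subtract_confounder(
    pretrained: Params,
    task_ft: Params,
    confounder_ft: Params,
    scale: float = 1.0,
) -> Params:
    """Assumed edit rule: theta = theta_pre + tau_task - scale * tau_conf.

    `confounder_ft` is a hypothetical secondary model fine-tuned to predict
    the confounding variable; `scale` is an assumed subtraction strength.
    """
    tau_task = task_vector(pretrained, task_ft)
    tau_conf = task_vector(pretrained, confounder_ft)
    return {
        name: [
            p + t - scale * c
            for p, t, c in zip(pretrained[name], tau_task[name], tau_conf[name])
        ]
        for name in pretrained
    }


# Toy example: the second weight is the one the confounder model also moved.
pretrained = {"w": [0.0, 0.0]}
task_ft = {"w": [1.0, 1.0]}
confounder_ft = {"w": [0.0, 1.0]}

edited = subtract_confounder(pretrained, task_ft, confounder_ft)
print(edited)
```

In this toy case the edit leaves the first weight's task update intact and cancels the update on the second weight, which is the direction shared with the confounder model. In this sketch the parameters being subtracted are full model weights, which is the source of the fine-tuning cost the abstract mentions.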