AI RESEARCH

Undetectable Backdoors in Model Parameters: Hiding Sparse Secrets in High Dimensions

arXiv CS.AI

ArXi:2605.04209v1 Announce Type: cross We present Sparse Backdoor, a supply-chain attack that plants a \emph{provably undetectable} backdoor in pre-trained image classifiers, including convolutional networks and Vision Transformers. The attack injects a structured sparse perturbation along a randomly chosen direction into a small subset of columns at each fully connected layer, propagating a trigger signal to an adversary-chosen target class, and masks the perturbation with an independent isotropic Gaussian dither.