AI RESEARCH
OrthoEraser: Coupled-Neuron Orthogonal Projection for Concept Erasure
arXiv CS.AI
arXiv:2603.11493v1 Announce Type: cross Text-to-image (T2I) models face significant safety risks from adversarial induction, yet current concept erasure methods often cause collateral damage to benign attributes when suppressing selected neurons entirely. This occurs because sensitive and benign semantics exhibit non-orthogonal superposition, sharing activation subspaces where their respective vectors are inherently entangled.
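To make the geometric claim concrete, the following is a minimal toy sketch, not the paper's method. It assumes a hypothetical three-neuron activation space and two made-up concept vectors, `sensitive` and `benign`, that share one neuron. It compares zeroing the neurons on which the sensitive concept is active with projecting out the sensitive direction. All names and values are illustrative assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def zero_neurons(x, idx):
    """Hard suppression: set the selected neuron activations to zero."""
    return [0.0 if i in idx else a for i, a in enumerate(x)]

def project_out(x, d):
    """Orthogonal projection: remove the component of x along direction d."""
    scale = dot(x, d) / dot(d, d)
    return [a - scale * b for a, b in zip(x, d)]

# Hypothetical concept vectors (assumed for illustration only).
sensitive = [1.0, 1.0, 0.0]   # active on neurons 0 and 1
benign = [0.0, 1.0, 1.0]      # active on neurons 1 and 2; neuron 1 is shared

# A non-zero cosine means the two concepts are not orthogonal.
overlap = cosine(sensitive, benign)

# An activation that carries both concepts at once.
activation = [s + b for s, b in zip(sensitive, benign)]

# Strategy 1: zero every neuron on which the sensitive concept is active.
suppressed = zero_neurons(activation, {0, 1})

# Strategy 2: subtract only the component along the sensitive direction.
projected = project_out(activation, sensitive)

# Readouts along each concept direction after erasure.
sensitive_after_suppress = dot(suppressed, sensitive)
sensitive_after_project = dot(projected, sensitive)
benign_before = dot(activation, benign)
benign_after_suppress = dot(suppressed, benign)
benign_after_project = dot(projected, benign)

print("overlap:", overlap)
print("benign readout before/suppressed/projected:",
      benign_before, benign_after_suppress, benign_after_project)
```

In this toy setting both strategies drive the sensitive readout to zero, but zeroing the shared neuron discards more of the benign readout than the projection does. The projection still loses some benign signal, because the part of `benign` that lies along `sensitive` cannot be separated from it, which is the entanglement the abstract describes.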