Distinguishable Deletion: Unifying Knowledge Erasure and Refusal for Large Language Model Unlearning

arXiv:2605.16776v1 Announce Type: new Mitigating sensitive and harmful outputs is fundamental to ensuring the safe deployment of LLMs. Existing approaches typically follow two paradigms: Knowledge Deletion (KD), which erases undesirable information during