AI RESEARCH
Finding and Reactivating Post-Trained LLMs' Hidden Safety Mechanisms
arXiv CS.AI
arXiv:2604.00012v1 Announce Type: cross Despite the impressive performance of general-purpose large language models (LLMs), they often require fine-tuning or post-