You Fine-Tuned Your Model. Now It’s Worse. Here’s the Concept You Were Never Taught.

Gradient descent doesn’t know which connections matter. It updates everything equally - and that’s the entire problem. You spent a week fine-tuning a language model on your domain-specific dataset. The eval metrics look great. You deploy it, run a quick sanity check - and it nails the new task. Then someone asks it something basic. Something any pre-trained model could handle in its sleep. It fumbles. That’s not a bug in your code. That’s catastrophic forgetting - and it’s one of the most common, least-explained phenomena in all of applied machine learning.