You Fine-Tuned Your Model. Now It’s Worse. Here’s the Concept You Were Never Taught.

Towards AI

Gradient descent doesn’t know which connections matter. It updates every weight using only the new task’s gradient, with nothing protecting what the model already knew, and that’s the entire problem.

You spent a week fine-tuning a language model on your domain-specific dataset. The eval metrics look great. You deploy it, run a quick sanity check, and it nails the new task. Then someone asks it something basic. Something the pre-trained model you started from could handle in its sleep. It fumbles. That’s not a bug in your code. That’s catastrophic forgetting, and it’s one of the most common, least-explained phenomena in applied machine learning.