Knowledge is Not Enough: Injecting RL Skills for Continual Adaptation

arXiv:2601.11258v2

Large Language Models (LLMs) face the "knowledge cutoff" challenge: their frozen parametric memory prevents them from directly internalizing new information. Supervised Fine-Tuning (SFT) is commonly used to update model knowledge, but it often updates factual content without reliably improving the model's ability to use the newly incorporated information for question answering or decision-making.