SkillFactory: Self-Distillation For Learning Cognitive Behaviors

arXiv:2512.04072v2 Announce Type: replace-cross

Reasoning models that leverage long chains of thought employ a range of cognitive skills, such as verifying their answers, backtracking, and retrying with an alternative method. Previous work has shown that when a base language model exhibits these skills