AI RESEARCH
The Long Delay to Arithmetic Generalization: When Learned Representations Outrun Behavior
arXiv cs.AI
arXiv:2604.13082v1 Announce Type: cross Grokking in transformers trained on algorithmic tasks is characterized by a long delay between