AI RESEARCH
LinearARD: Linear-Memory Attention Distillation for RoPE Restoration
arXiv CS.AI
arXiv:2604.00004v1 Announce Type: cross The extension of context windows in Large Language Models is typically facilitated by scaling positional encodings followed by lightweight Continual Pre-