AI RESEARCH

Simply Stabilizing the Loop via Fully Looped Transformer

arXiv CS.AI

arXiv:2605.18797v1 Announce Type: cross Scaling model performance typically requires increasing model size. The Looped Transformer offers a compelling alternative by iteratively reusing the same Transformer blocks, trading additional computation for improved performance without increasing parameter count or context length. Because the number of loop iterations can be adjusted at inference, it also provides a natural mechanism for balancing performance and test-time compute. However, the Looped Transformer still suffers from