Dr.LLM: Dynamic Layer Routing in LLMs

arXiv:2510.12773v2 Announce Type: replace-cross Large Language Models (LLMs) process every token through all layers of a transformer stack, which wastes computation on simple queries and leaves too little flexibility for harder ones that need deeper reasoning. Adaptive-depth methods can improve efficiency, but prior approaches rely on costly inference-time search, architectural changes, or large-scale re