Efficient Construction of Model Family through Progressive Training Using Model Expansion

arXiv:2504.00623v2

As Large Language Models (LLMs) gain widespread practical application, offering model families with varying parameter sizes has become standard practice to accommodate diverse computational requirements. Traditionally, each model in the family is trained independently, incurring computational costs that scale additively with the number of models. In this work, we propose an efficient method for constructing model families via progressive