Expert Upcycling: Growing MoE capacity mid-training without increasing inference cost (7B→13B, ~32% GPU hours saved)

r/LocalLLaMA • April 23, 2026


Author here, sharing a preprint we recently released. We're actively looking for feedback from this community before we revise.

Motivation