AI RESEARCH

ResBM: a new transformer-based architecture for low-bandwidth pipeline-parallel training, achieving 128× activation compression [R]

r/MachineLearning

Macrocosmos has released a paper on ResBM (Residual Bottleneck Models), a new transformer-based architecture designed for low-bandwidth pipeline-parallel