Educational PyTorch repo for distributed training from scratch: DP, FSDP, TP, FSDP+TP, and PP [P]

r/MachineLearning

I put together a small educational repo that implements distributed