Dynamic batching for Encoder-Decoder MT training or generation when long sequences cap the batch size [P]

r/MachineLearning

I built a small PyTorch sampler called dynabatch after running into this specific batching issue while fine-tuning an NLLB-200 600M model
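
To make the batching issue concrete, here is a minimal sketch of one common length-aware approach: sort examples by token length and fill each batch until its padded size (batch size times longest sequence) would exceed a token budget. This is an illustration only and not dynabatch's actual code or API; the class name `TokenBudgetBatchSampler` and the `max_tokens` parameter are hypothetical, and it assumes token lengths are known up front.

```python
import random
from typing import Iterator, List, Sequence


class TokenBudgetBatchSampler:
    """Hypothetical sampler: keeps padded tokens per batch under a budget."""

    def __init__(self, lengths: Sequence[int], max_tokens: int,
                 shuffle: bool = True, seed: int = 0) -> None:
        self.lengths = list(lengths)
        self.max_tokens = max_tokens
        self.shuffle = shuffle
        self.rng = random.Random(seed)
        self.batches = self._build_batches()

    def _build_batches(self) -> List[List[int]]:
        # Longest first, so the first index in a batch sets its padded length.
        order = sorted(range(len(self.lengths)),
                       key=lambda i: self.lengths[i], reverse=True)
        batches: List[List[int]] = []
        current: List[int] = []
        current_max = 0
        for idx in order:
            new_max = max(current_max, self.lengths[idx])
            # Padded cost if idx joins the batch: rows * longest sequence.
            if current and new_max * (len(current) + 1) > self.max_tokens:
                batches.append(current)
                current, new_max = [], self.lengths[idx]
            current.append(idx)
            current_max = new_max
        if current:
            batches.append(current)
        return batches

    def __iter__(self) -> Iterator[List[int]]:
        batches = list(self.batches)
        if self.shuffle:
            # Shuffle batch order only; each batch stays length-homogeneous.
            self.rng.shuffle(batches)
        return iter(batches)

    def __len__(self) -> int:
        return len(self.batches)


# Two long sequences share a small batch; the short ones share a large one.
lengths = [512, 480, 60, 55, 50, 48, 45, 40]
sampler = TokenBudgetBatchSampler(lengths, max_tokens=1024, shuffle=False)
print(list(sampler))  # [[0, 1], [2, 3, 4, 5, 6, 7]]
```

Since it only yields lists of indices, an object like this could be passed as `batch_sampler=` to `torch.utils.data.DataLoader`. A sequence longer than the budget ends up alone in its own batch rather than being dropped.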