Dynamic batching for encoder-decoder MT training or generation when long sequences cap the batch size [P]
r/MachineLearning
I built a small PyTorch sampler called dynabatch after running into this specific batching issue while fine-tuning an NLLB-200 600M model.
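To make the issue concrete, here is a rough sketch of the general idea behind length-aware batching. This is not dynabatch's actual code or API. The function name `token_budget_batches`, the `max_padded_tokens` parameter, and the example lengths are all made up for illustration. It assumes you already have one length per example (for an encoder-decoder model, that could be something like the max of source and target length).

```python
from typing import Iterator, List, Sequence


def token_budget_batches(
    lengths: Sequence[int], max_padded_tokens: int
) -> Iterator[List[int]]:
    """Yield lists of dataset indices whose padded size fits a token budget.

    Hypothetical helper for illustration only; not part of dynabatch.
    """
    # Longest-first, so the first item in each batch sets its padded width.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    batch: List[int] = []
    width = 0  # padded length of the current batch (its longest sequence)
    for idx in order:
        # Close the batch if one more row at this width would exceed the budget.
        if batch and (len(batch) + 1) * width > max_padded_tokens:
            yield batch
            batch = []
        if not batch:
            width = lengths[idx]
        batch.append(idx)
    if batch:
        yield batch


# Made-up lengths: two long sequences and six short ones.
lengths = [512, 480, 60, 58, 55, 50, 40, 32]
batches = list(token_budget_batches(lengths, max_padded_tokens=1024))
print(batches)  # [[0, 1], [2, 3, 4, 5, 6, 7]]
```

With a fixed batch size, the two long sequences would force a batch size of 2 for everything. Here the short sequences get a batch of 6 under the same budget. A list of index lists like this can be passed to a PyTorch `DataLoader` through its `batch_sampler` argument. Note that in this sketch a single sequence longer than the budget still comes out as a batch of one.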