How do you experiment with a (very) large model architecture? [D]

r/MachineLearning

I'm trying to reproduce a paper (a very particular kind of diffusion model), and their