Started a video series on building an orchestration layer for LLM post-training [P]

r/MachineLearning

Hi everyone! Context, motivation, a lot of yapping, feel free to skip to TL;DR. A while back I posted here asking [D] What framework do you use for RL post-