Trying to train tiny LLMs on a length-constrained Reddit post summarization task using GRPO on 3x Mac Minis - updates!
r/LocalLLaMA
Generative AI
So, here's an update to my GRPO