The Synthetic Data Playbook: Generating Trillions of the Finest Tokens
r/LocalLLaMA
Hugging Face just released the Synthetic Data Playbook. They generated over 1T tokens across 90 experiments, using 100k+ GPU hours, to figure out what makes good synthetic data and how to generate it at scale.

Submitted by /u/joelinho95