NanoGPT Slowrun: 10x Data Efficiency with Infinite Compute (7 minute read)

Researchers achieved 10x data efficiency within a few weeks on NanoGPT Slowrun, a benchmark for language modeling algorithms in the infinite-compute setting. Data efficiency matters because compute grows much faster than data, so intelligence will eventually be bottlenecked by data, not compute. This result lets researchers improve model performance by scaling compute rather than data.