Applying Karpathy's autoresearch to a 33M-token public transit dataset (14% improvement, replication notes) [P]

r/MachineLearning

Hello r/MachineLearning! I work in the US transit industry, and a few months ago I went all-in on learning AI & ML. When I heard about Andrej Karpathy's autoresearch framework, I thought it was really cool. I decided to reuse the transit dataset from an earlier GPT-2 XL fine-tuning project to train a small 80M-parameter model from scratch. Autoresearch is designed for from-scratch pre