AI RESEARCH
Is More Data Worth the Cost? Dataset Scaling Laws in a Tiny Attention-Only Decoder
arXiv CS.LG
arXiv:2604.09389v1 Announce Type: new