SMolLM: Small Language Models Learn Small Molecular Grammar
arXiv CS.LG
arXiv:2605.06322v1 Announce Type: new Language models for molecular design have scaled to hundreds of millions of parameters, yet how they learn chemical grammar is poorly understood. We train SMolLM, a 53K-parameter weight-shared transformer, to generate novel SMILES with 95% validity on the ZINC-250K drug-like-molecule benchmark, outperforming a standard GPT with 10 times as many parameters.