AI RESEARCH

Atoms as Language: VQ-Atom: Semantic Discretization for Molecular Representation Learning

arXiv cs.LG

arXiv:2605.16823v1 Announce Type: new. Molecular representation learning has become a central approach in AI-driven drug discovery, yet existing molecular tokenizations such as SMILES remain largely syntactic and do not naturally align with chemically meaningful substructures. In this work, we [...] We evaluate VQ-Atom on protein-ligand interaction prediction under a protein-cold split, without relying on 3D structural information.
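For readers unfamiliar with the evaluation setting, a protein-cold split holds out entire proteins, so that no protein seen in training appears at test time. The following is a minimal sketch of that idea only, not the authors' code: the function name `protein_cold_split`, the `(protein_id, ligand_smiles, label)` tuple layout, and the toy data are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

# Hypothetical record layout: (protein_id, ligand_smiles, interaction_label)
Pair = Tuple[str, str, int]


def protein_cold_split(
    pairs: List[Pair], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Pair], List[Pair]]:
    """Split pairs so that no test protein also appears in the training set."""
    # Split at the level of proteins, not individual protein-ligand pairs.
    proteins = sorted({protein for protein, _, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(proteins)

    # Hold out at least one protein for testing.
    n_test = max(1, int(round(len(proteins) * test_fraction)))
    test_proteins = set(proteins[:n_test])

    train = [p for p in pairs if p[0] not in test_proteins]
    test = [p for p in pairs if p[0] in test_proteins]
    return train, test


# Toy data (illustrative only): ligands may repeat across splits, proteins may not.
pairs = [
    ("P1", "CCO", 1), ("P1", "c1ccccc1", 0),
    ("P2", "CCO", 0), ("P3", "CC(=O)O", 1),
    ("P4", "c1ccccc1", 1), ("P5", "CCN", 0),
]
train, test = protein_cold_split(pairs, test_fraction=0.4, seed=0)
```

Because the split is made over protein identifiers, the same ligand can occur on both sides while every test protein remains unseen during training, which is what makes the setting harder than a random split over pairs.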