AI RESEARCH

Sampling at intermediate temperatures is optimal for training large language models in protein structure prediction

arXiv cs.LG

arXiv:2603.29529v1 Announce Type: cross We investigate the parameter space of transformer models trained on protein sequence data using a statistical mechanics framework. We sample the loss landscape at varying temperatures with Langevin dynamics to characterize the low-loss manifold and to understand the mechanisms underlying the superior performance of transformers in protein structure prediction.
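
To make "sampling the loss landscape at varying temperatures by Langevin dynamics" concrete, here is a minimal sketch, not the authors' implementation. It assumes the standard overdamped Langevin update with an Euler-Maruyama discretization, theta <- theta - eta * grad L(theta) + sqrt(2 * eta * T) * noise, whose stationary distribution is approximately proportional to exp(-L(theta) / T). The quadratic `toy_loss` stands in for a real transformer training loss, and the names `langevin_sample`, `toy_loss`, and `toy_grad`, along with the step size and temperatures, are illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def langevin_sample(grad_fn, theta0, temperature, step_size=1e-2,
                    n_steps=20000, seed=0):
    """Overdamped Langevin dynamics (Euler-Maruyama discretization).

    Each step follows the negative gradient and adds Gaussian noise
    scaled by sqrt(2 * step_size * temperature).
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    samples = np.empty((n_steps, theta.size))
    noise_scale = np.sqrt(2.0 * step_size * temperature)
    for t in range(n_steps):
        noise = rng.standard_normal(theta.shape)
        theta = theta - step_size * grad_fn(theta) + noise_scale * noise
        samples[t] = theta
    return samples


def toy_loss(theta):
    # Hypothetical stand-in for a training loss: L(theta) = 0.5 * ||theta||^2.
    return 0.5 * np.sum(theta ** 2, axis=-1)


def toy_grad(theta):
    # Gradient of the toy quadratic loss.
    return theta


# Sample at several temperatures and record the mean loss after burn-in.
mean_loss = {}
for T in (0.01, 0.1, 1.0):
    samples = langevin_sample(toy_grad, np.zeros(4), T)
    mean_loss[T] = toy_loss(samples[2000:]).mean()

for T, value in mean_loss.items():
    print(f"T={T}: mean loss {value:.4f}")
```

For this toy loss in 4 dimensions, equipartition gives an expected mean loss of about 2T, so lower temperatures concentrate the samples closer to the minimum while higher temperatures explore a wider region around it. With a real model, `grad_fn` would be replaced by a (possibly minibatch) gradient of the training loss, which introduces additional noise not modeled here.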