cond-mat.dis-nn, cs.LG, q-bio.BM

Sampling at intermediate temperatures is optimal for training large language models in protein structure prediction

arXiv:2603.29529v1 Announce Type: cross
Abstract: We investigate the parameter space of transformer models trained on protein sequence data using a statistical mechanics framework, sampling the loss landscape at varying temperatures by Langevin dynami…
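
The sampling scheme named in the abstract, Langevin dynamics run at a chosen temperature, can be illustrated on a toy problem. What follows is a minimal sketch and not the authors' implementation. It assumes overdamped Langevin dynamics discretized with the Euler-Maruyama scheme, and it uses a quadratic loss as a stand-in for a transformer loss. The names `langevin_sample`, `quadratic_grad`, `step_size`, and `temperature` are hypothetical and chosen only for illustration.

```python
import numpy as np


def langevin_sample(grad_loss, theta0, temperature, step_size=1e-2,
                    n_steps=50000, seed=0):
    """Sample parameters with overdamped Langevin dynamics (Euler-Maruyama).

    Assumed update rule (not taken from the paper):
        theta <- theta - h * grad L(theta) + sqrt(2 * h * T) * xi,
    with xi drawn from a standard normal. For small h the chain samples
    approximately from the Gibbs distribution exp(-L(theta) / T).
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    samples = np.empty((n_steps, theta.size))
    noise_scale = np.sqrt(2.0 * step_size * temperature)
    for t in range(n_steps):
        noise = rng.standard_normal(theta.shape)
        theta = theta - step_size * grad_loss(theta) + noise_scale * noise
        samples[t] = theta
    return samples


def quadratic_grad(theta):
    """Gradient of the toy loss L(theta) = 0.5 * ||theta||^2."""
    return theta


if __name__ == "__main__":
    burn_in = 5000  # discard early steps before the chain equilibrates
    for temperature in (0.1, 1.0):
        samples = langevin_sample(quadratic_grad, np.zeros(2), temperature)
        # For this toy loss the per-coordinate variance should be close to T.
        print(temperature, samples[burn_in:].var())
```

For the quadratic toy loss the stationary variance of each coordinate is approximately equal to the temperature, which gives a quick sanity check: raising `temperature` widens the region of parameter space the chain explores, while lowering it concentrates samples near the minimum.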