Sampling at intermediate temperatures is optimal for training large language models in protein structure prediction
arXiv:2603.29529v1 Announce Type: cross
Abstract: We investigate the parameter space of transformer models trained on protein sequence data using a statistical mechanics framework, sampling the loss landscape at varying temperatures by Langevin dynami…
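A minimal sketch of what sampling a loss landscape at a fixed temperature by Langevin dynamics can look like. It assumes overdamped Langevin updates and a toy one-dimensional quadratic loss standing in for the transformer loss. The function names, step size, and temperatures are illustrative assumptions, not taken from the paper.

```python
import math
import random


def langevin_sample(grad, theta0, temperature, step_size, n_steps, seed=0):
    """Overdamped Langevin dynamics (illustrative, not the paper's sampler).

    Update rule: theta <- theta - eta * grad(theta) + sqrt(2 * eta * T) * xi,
    where xi is standard Gaussian noise, eta is step_size and T is temperature.
    """
    rng = random.Random(seed)
    theta = theta0
    noise_scale = math.sqrt(2.0 * step_size * temperature)
    samples = []
    for _ in range(n_steps):
        theta = theta - step_size * grad(theta) + noise_scale * rng.gauss(0.0, 1.0)
        samples.append(theta)
    return samples


def quadratic_grad(theta):
    # Gradient of the toy loss L(theta) = 0.5 * theta**2 (an assumption for this sketch).
    return theta


def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


if __name__ == "__main__":
    # Compare the spread of samples at a few hypothetical temperatures.
    for T in (0.1, 0.5, 1.0):
        chain = langevin_sample(quadratic_grad, 0.0, T, 0.05, 100_000, seed=1)
        print(T, variance(chain[1000:]))  # discard the first 1000 steps as burn-in
```

For this toy loss the stationary distribution is proportional to exp(-L(theta)/T), a Gaussian whose variance equals T in the small-step limit, so higher temperatures explore a wider region around the minimum and T = 0 reduces to plain gradient descent.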