cs.LG, q-bio.QM

Better Models, Faster Training: Sigmoid Attention for single-cell Foundation Models

arXiv:2604.27124v1 Announce Type: new
Abstract: Training stable biological foundation models requires rethinking attention mechanisms: we find that using sigmoid attention as a drop-in replacement for softmax attention a) produces better learned repre…
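
To make the "drop-in replacement" idea concrete, here is a minimal pure-Python sketch that contrasts the two attention variants on the same inputs. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, there is a single head with no masking, and the default bias of -log(n) (with n the number of keys) is a convention taken from earlier sigmoid-attention work rather than something this abstract specifies. The only change between the two functions is the normalization step: softmax couples all scores in a row so the weights sum to 1, whereas the sigmoid squashes each score independently.

```python
import math


def _scores(q, k):
    """Scaled dot-product scores: one row per query, one column per key."""
    d = len(q[0])
    return [[sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
            for qi in q]


def _mix(weights, v):
    """Weighted sum of the value vectors for each query row."""
    return [[sum(w * vj[c] for w, vj in zip(row, v)) for c in range(len(v[0]))]
            for row in weights]


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softmax_attention(q, k, v):
    """Standard attention: each row of weights is normalized to sum to 1."""
    weights = []
    for row in _scores(q, k):
        m = max(row)  # subtract the row max for numerical stability
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights.append([e / z for e in exps])
    return weights, _mix(weights, v)


def sigmoid_attention(q, k, v, bias=None):
    """Sigmoid attention: each score is squashed independently, no row sum.

    ASSUMPTION: the default bias of -log(n) is borrowed from prior work on
    sigmoid attention; the abstract above does not state which bias is used.
    """
    if bias is None:
        bias = -math.log(len(k))
    weights = [[_sigmoid(s + bias) for s in row] for row in _scores(q, k)]
    return weights, _mix(weights, v)


if __name__ == "__main__":
    # Toy inputs (hypothetical): 2 queries, 3 keys/values, dimension 2.
    q = [[0.0, 0.0], [1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    for name, fn in (("softmax", softmax_attention),
                     ("sigmoid", sigmoid_attention)):
        weights, out = fn(q, k, v)
        print(name, [round(sum(row), 3) for row in weights], out)
```

Both functions take the same arguments and return outputs of the same shape, which is what allows one to be swapped for the other without touching the rest of a model. With the -log(n) bias, an all-zero score row gives each key a weight of 1/(1+n), so the total attention mass per query starts out on the same order as the softmax case.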