cs.AI, cs.LG

FLUID: Continuous-Time Hyperconnected Sparse Transformer for Sink-Free Learning

arXiv:2605.04421v1 Announce Type: new
Abstract: Continuous-time (CT) Transformers improve irregular and long-range modeling over CT-RNNs by exploiting input or output embeddings with continuous dynamics. However, the core scaled-dot-product-attentio…