Graph Memory Transformer (GMT)
arXiv:2604.23862v1 Announce Type: cross
Abstract: We investigate whether the Feed-Forward Network (FFN) sublayer in a decoder-only transformer can be replaced by an explicit learned memory graph while preserving the surrounding autoregressive architec…