cs.AI, cs.IT, cs.LG, math.IT

Route Experts by Sequence, not by Token

arXiv:2511.06494v2 Announce Type: replace-cross
Abstract: Mixture-of-Experts (MoE) architectures scale large language models (LLMs) by activating only a subset of experts per token, but the standard TopK routing assigns the same fixed number of expert…

cs.CL

Approaches to Analysing Historical Newspapers Using LLMs

arXiv:2603.25051v2 Announce Type: replace
Abstract: This study presents a computational analysis of the Slovene historical newspapers Slovenec and Slovenski narod from the sPeriodika corpus, combining topic modelling, large language …
