cs.LG

ReBaPL: Repulsive Bayesian Prompt Learning

arXiv:2511.17339v2 Announce Type: replace
Abstract: Prompt learning has emerged as an effective technique for fine-tuning large-scale foundation models for downstream tasks. However, conventional prompt learning methods are prone to overfitting and ca…

cs.AI, cs.IT, cs.LG, math.IT

Route Experts by Sequence, not by Token

arXiv:2511.06494v2 Announce Type: replace-cross
Abstract: Mixture-of-Experts (MoE) architectures scale large language models (LLMs) by activating only a subset of experts per token, but the standard TopK routing assigns the same fixed number of expert…
