cs.LG

Split-on-Share: Mixture of Sparse Experts for Task-Agnostic Continual Learning

arXiv:2601.17616v2 Announce Type: replace
Abstract: Continual learning in Large Language Models (LLMs) is hindered by the plasticity-stability dilemma, where acquiring new capabilities often leads to catastrophic forgetting of previous knowledge. Exis…