From Static Inference to Dynamic Interaction: A Survey of Streaming Large Language Models
arXiv:2603.04592v3 Announce Type: replace
Abstract: Standard Large Language Models (LLMs) are predominantly designed for static inference with pre-defined inputs, which limits their applicability in dynamic, real-time scenarios. To address this gap, t…