cs.AI, cs.CL

Two-dimensional early exit optimisation of LLM inference

arXiv:2604.18592v1 Announce Type: cross
Abstract: We introduce a two-dimensional (2D) early exit strategy that coordinates layer-wise and sentence-wise exiting for classification tasks in large language models. By processing input incrementally senten…