Keep What Audio Cannot Say: Context-Preserving Token Pruning for Omni-LLMs
arXiv:2605.11605v1 Announce Type: new
Abstract: Omnimodal Large Language Models (Omni-LLMs) incur substantial computational overhead due to the large number of multimodal input tokens they process, making token reduction essential for real-world deplo…