cs.AI, cs.CV

Token-Efficient Multimodal Reasoning via Image Prompt Packaging

arXiv:2604.02492v1 Announce Type: new
Abstract: Deploying large multimodal language models at scale is constrained by token-based inference costs, yet the cost-performance behavior of visual prompting strategies remains poorly characterized. We introd…